Overview


Dataset statistics

Number of variables: 138
Number of observations: 988402
Missing cells: 69735969
Missing cells (%): 51.1%
Total size in memory: 1.0 GiB
Average record size in memory: 1.1 KiB
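The percentage and per-record figures above follow from the raw counts. A minimal Python sketch of the arithmetic, assuming the missing-cell percentage is missing cells divided by variables times observations and that 1 GiB is 2**30 bytes (the function names are illustrative, not part of ydata-profiling):

```python
def missing_cell_pct(missing_cells: int, n_vars: int, n_obs: int) -> float:
    """Share of empty cells in an n_obs x n_vars table, in percent."""
    return 100.0 * missing_cells / (n_vars * n_obs)


def avg_record_kib(total_bytes: float, n_obs: int) -> float:
    """Average in-memory size of one record, in KiB."""
    return total_bytes / n_obs / 1024


pct = missing_cell_pct(69_735_969, 138, 988_402)
# The reported 1.0 GiB is itself a rounded figure, so this is approximate.
rec = avg_record_kib(1.0 * 2**30, 988_402)
print(f"{pct:.1f}%  {rec:.1f} KiB")
```

Both results round to the values shown in the report (51.1% and 1.1 KiB).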

Variable types

Text: 138

Dataset

Description: US NMNH Extant Specimen Records 0052487-241126133413365
URL: https://doi.org/10.15468/dl.wttrju

Alerts

license has constant value "CC0_1_0" Constant
publisher has constant value "National Museum of Natural History, Smithsonian Institution" Constant
institutionID has constant value "urn:lsid:biocol.org:col:15463" Constant
collectionID has constant value "urn:uuid:60e28f81-e634-4869-aa3e-732caed713c8" Constant
institutionCode has constant value "US" Constant
collectionCode has constant value "US" Constant
datasetName has constant value "NMNH Extant Biology" Constant
occurrenceStatus has constant value "PRESENT" Constant
verbatimSRS has constant value "1938-11-11" Constant
footprintSRS has constant value "315" Constant
footprintSpatialFit has constant value "315" Constant
georeferencedBy has constant value "1938" Constant
georeferencedDate has constant value "11" Constant
georeferenceSources has constant value "11 Nov 1938" Constant
latestEpochOrHighestSeries has constant value "South America - Neotropics, Colombia, Meta" Constant
earliestAgeOrLowestStage has constant value "SOUTH_AMERICA" Constant
lowestBiostratigraphicZone has constant value "7296210" Constant
lithostratigraphicTerms has constant value "CO" Constant
group has constant value "Meta" Constant
dateIdentified has constant value "Plantae, Dicotyledonae, Malpighiales, Violaceae, Violoideae" Constant
identificationReferences has constant value "Plantae" Constant
identificationVerificationStatus has constant value "Tracheophyta" Constant
identificationRemarks has constant value "Magnoliopsida" Constant
taxonID has constant value "Malpighiales" Constant
namePublishedInID has constant value "Rinorea" Constant
taxonConceptID has constant value "Rinorea" Constant
parentNameUsage has constant value "pubiflora" Constant
originalNameUsage has constant value "pubiflora" Constant
namePublishedIn has constant value "VARIETY" Constant
superfamily has constant value "821cc27a-e3bb-4bc5-ac34-89ada245069d" Constant
subfamily has constant value "2024-12-02T13:57:09.776Z" Constant
tribe has constant value "450.0" Constant
subtribe has constant value "50.0" Constant
infragenericEpithet has constant value "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT" Constant
cultivarEpithet has constant value "false" Constant
verbatimTaxonRank has constant value "7296210" Constant
nomenclaturalCode has constant value "7707728" Constant
nomenclaturalStatus has constant value "1414" Constant
taxonRemarks has constant value "6631" Constant
publishingCountry has constant value "US" Constant
subgenusKey has constant value "Magnoliopsida" Constant
protocol has constant value "EML" Constant
projectId has constant value "edulis" Constant
catalogNumber has 132504 (13.4%) missing values Missing
recordedBy has 11879 (1.2%) missing values Missing
lifeStage has 916836 (92.8%) missing values Missing
preparations has 959242 (97.0%) missing values Missing
associatedSequences has 988328 (> 99.9%) missing values Missing
occurrenceRemarks has 968411 (98.0%) missing values Missing
fieldNumber has 988343 (> 99.9%) missing values Missing
eventDate has 119809 (12.1%) missing values Missing
startDayOfYear has 261666 (26.5%) missing values Missing
endDayOfYear has 261666 (26.5%) missing values Missing
year has 122319 (12.4%) missing values Missing
month has 181983 (18.4%) missing values Missing
day has 314697 (31.8%) missing values Missing
verbatimEventDate has 655426 (66.3%) missing values Missing
habitat has 877971 (88.8%) missing values Missing
locationID has 979422 (99.1%) missing values Missing
continent has 32788 (3.3%) missing values Missing
waterBody has 984227 (99.6%) missing values Missing
islandGroup has 963568 (97.5%) missing values Missing
island has 906001 (91.7%) missing values Missing
countryCode has 10855 (1.1%) missing values Missing
stateProvince has 219376 (22.2%) missing values Missing
county has 826754 (83.6%) missing values Missing
locality has 72708 (7.4%) missing values Missing
verbatimDepth has 983702 (99.5%) missing values Missing
decimalLatitude has 841005 (85.1%) missing values Missing
decimalLongitude has 841005 (85.1%) missing values Missing
coordinateUncertaintyInMeters has 987002 (99.9%) missing values Missing
verbatimCoordinateSystem has 980404 (99.2%) missing values Missing
verbatimSRS has 988401 (> 99.9%) missing values Missing
footprintSRS has 988401 (> 99.9%) missing values Missing
footprintSpatialFit has 988401 (> 99.9%) missing values Missing
georeferencedBy has 988401 (> 99.9%) missing values Missing
georeferencedDate has 988401 (> 99.9%) missing values Missing
georeferenceProtocol has 960543 (97.2%) missing values Missing
georeferenceSources has 988401 (> 99.9%) missing values Missing
georeferenceRemarks has 988289 (> 99.9%) missing values Missing
latestEpochOrHighestSeries has 988401 (> 99.9%) missing values Missing
earliestAgeOrLowestStage has 988401 (> 99.9%) missing values Missing
lowestBiostratigraphicZone has 988401 (> 99.9%) missing values Missing
lithostratigraphicTerms has 988401 (> 99.9%) missing values Missing
group has 988401 (> 99.9%) missing values Missing
bed has 988400 (> 99.9%) missing values Missing
identificationQualifier has 985985 (99.8%) missing values Missing
typeStatus has 967033 (97.8%) missing values Missing
identifiedBy has 866335 (87.7%) missing values Missing
dateIdentified has 988401 (> 99.9%) missing values Missing
identificationReferences has 988401 (> 99.9%) missing values Missing
identificationVerificationStatus has 988401 (> 99.9%) missing values Missing
identificationRemarks has 988401 (> 99.9%) missing values Missing
taxonID has 988401 (> 99.9%) missing values Missing
namePublishedInID has 988401 (> 99.9%) missing values Missing
taxonConceptID has 988401 (> 99.9%) missing values Missing
parentNameUsage has 988401 (> 99.9%) missing values Missing
originalNameUsage has 988401 (> 99.9%) missing values Missing
namePublishedIn has 988401 (> 99.9%) missing values Missing
order has 10135 (1.0%) missing values Missing
superfamily has 988401 (> 99.9%) missing values Missing
family has 10432 (1.1%) missing values Missing
subfamily has 988401 (> 99.9%) missing values Missing
tribe has 988401 (> 99.9%) missing values Missing
subtribe has 988401 (> 99.9%) missing values Missing
genus has 15345 (1.6%) missing values Missing
genericName has 15400 (1.6%) missing values Missing
infragenericEpithet has 988401 (> 99.9%) missing values Missing
specificEpithet has 75483 (7.6%) missing values Missing
infraspecificEpithet has 923675 (93.5%) missing values Missing
cultivarEpithet has 988401 (> 99.9%) missing values Missing
verbatimTaxonRank has 988401 (> 99.9%) missing values Missing
vernacularName has 988400 (> 99.9%) missing values Missing
nomenclaturalCode has 988401 (> 99.9%) missing values Missing
nomenclaturalStatus has 988401 (> 99.9%) missing values Missing
taxonRemarks has 988401 (> 99.9%) missing values Missing
elevation has 625728 (63.3%) missing values Missing
elevationAccuracy has 880635 (89.1%) missing values Missing
depth has 979722 (99.1%) missing values Missing
depthAccuracy has 980482 (99.2%) missing values Missing
distanceFromCentroidInMeters has 987807 (99.9%) missing values Missing
mediaType has 69371 (7.0%) missing values Missing
orderKey has 10134 (1.0%) missing values Missing
familyKey has 10432 (1.1%) missing values Missing
genusKey has 15344 (1.6%) missing values Missing
subgenusKey has 988401 (> 99.9%) missing values Missing
speciesKey has 75442 (7.6%) missing values Missing
species has 75443 (7.6%) missing values Missing
projectId has 988401 (> 99.9%) missing values Missing
gbifRegion has 19586 (2.0%) missing values Missing
level0Gid has 854767 (86.5%) missing values Missing
level0Name has 854767 (86.5%) missing values Missing
level1Gid has 855021 (86.5%) missing values Missing
level1Name has 855020 (86.5%) missing values Missing
level2Gid has 859029 (86.9%) missing values Missing
level2Name has 859040 (86.9%) missing values Missing
level3Gid has 953538 (96.5%) missing values Missing
level3Name has 953860 (96.5%) missing values Missing
iucnRedListCategory has 91545 (9.3%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique
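
The three alert kinds listed above (Constant, Missing, Unique) can be read as simple per-column rules. The sketch below is a hedged illustration in plain Python, not ydata-profiling's actual implementation: the function name `column_alerts` and the exact rules are assumptions chosen to match how the alerts read. It also shows why a column such as verbatimSRS can be flagged both Constant and Missing: it holds a single non-missing value among otherwise empty cells.

```python
from typing import List, Optional, Sequence


def column_alerts(values: Sequence[Optional[str]]) -> List[str]:
    """Return alert labels for one column; None marks a missing cell."""
    present = [v for v in values if v is not None]
    n_missing = len(values) - len(present)
    n_distinct = len(set(present))
    alerts = []
    if n_distinct == 1:
        alerts.append("Constant")
    if n_missing > 0:
        alerts.append("Missing")
    if present and n_missing == 0 and n_distinct == len(values):
        alerts.append("Unique")
    return alerts


# Toy columns (hypothetical data shaped like the report's columns).
license_col = ["CC0_1_0"] * 4
sparse_col = [None, None, "1938-11-11", None]
id_col = ["1320179379", "1675994101", "2592240144", "2571494932"]

print(column_alerts(license_col))
print(column_alerts(sparse_col))
print(column_alerts(id_col))
```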

Reproduction

Analysis started: 2025-01-08 22:48:51.675757
Analysis finished: 2025-01-08 22:49:46.176351
Duration: 54.5 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
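
The reported duration can be checked from the two timestamps. A small Python sketch, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2025-01-08 22:48:51.675757", FMT)
finished = datetime.strptime("2025-01-08 22:49:46.176351", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.1f} seconds")
```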

Variables

gbifID
Text

Unique 

Distinct: 988402
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 9884020
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 988402
Unique (%): 100.0%

Sample

1st row: 1320179379
2nd row: 1675994101
3rd row: 2592240144
4th row: 2571494932
5th row: 3357270605
Value Count Frequency (%)
1320179379 1
 
< 0.1%
1320208262 1
 
< 0.1%
1320183762 1
 
< 0.1%
1321737296 1
 
< 0.1%
1320181414 1
 
< 0.1%
2592240144 1
 
< 0.1%
2571494932 1
 
< 0.1%
3357270605 1
 
< 0.1%
1321730091 1
 
< 0.1%
1320180447 1
 
< 0.1%
Other values (988392) 988392
> 99.9%

Most occurring characters

ValueCountFrequency (%)
1 1442218
14.6%
2 1379595
14.0%
3 1292646
13.1%
5 938230
9.5%
6 853796
8.6%
4 850434
8.6%
7 817947
8.3%
8 805418
8.1%
0 782405
7.9%
9 721331
7.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 9884020
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1442218
14.6%
2 1379595
14.0%
3 1292646
13.1%
5 938230
9.5%
6 853796
8.6%
4 850434
8.6%
7 817947
8.3%
8 805418
8.1%
0 782405
7.9%
9 721331
7.3%

Most occurring scripts

ValueCountFrequency (%)
Common 9884020
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1442218
14.6%
2 1379595
14.0%
3 1292646
13.1%
5 938230
9.5%
6 853796
8.6%
4 850434
8.6%
7 817947
8.3%
8 805418
8.1%
0 782405
7.9%
9 721331
7.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9884020
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1442218
14.6%
2 1379595
14.0%
3 1292646
13.1%
5 938230
9.5%
6 853796
8.6%
4 850434
8.6%
7 817947
8.3%
8 805418
8.1%
0 782405
7.9%
9 721331
7.3%

license
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 6918814
Distinct characters: 4
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
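
As a concrete illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for the constant value CC0_1_0. It covers categories only (the standard library has no script or block lookup), and the helper name `category_counts` is illustrative rather than anything the profiler exposes.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count Unicode general categories (e.g. 'Lu', 'Nd') in text."""
    return Counter(unicodedata.category(ch) for ch in text)


counts = category_counts("CC0_1_0")
total = sum(counts.values())
for cat, n in counts.most_common():
    print(cat, n, f"{100 * n / total:.1f}%")
```

Here `Nd` (Decimal Number) accounts for 3 of the 7 characters, while `Lu` (Uppercase Letter) and `Pc` (Connector Punctuation) account for 2 each, which is the 42.9% / 28.6% / 28.6% split the report gives for this column.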

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CC0_1_0
2nd row: CC0_1_0
3rd row: CC0_1_0
4th row: CC0_1_0
5th row: CC0_1_0
Value Count Frequency (%)
cc0_1_0 988402
100.0%

Most occurring characters

ValueCountFrequency (%)
C 1976804
28.6%
0 1976804
28.6%
_ 1976804
28.6%
1 988402
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2965206
42.9%
Uppercase Letter 1976804
28.6%
Connector Punctuation 1976804
28.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1976804
66.7%
1 988402
33.3%
Uppercase Letter
ValueCountFrequency (%)
C 1976804
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1976804
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4942010
71.4%
Latin 1976804
 
28.6%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1976804
40.0%
_ 1976804
40.0%
1 988402
20.0%
Latin
ValueCountFrequency (%)
C 1976804
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6918814
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 1976804
28.6%
0 1976804
28.6%
_ 1976804
28.6%
1 988402
14.3%
Distinct: 103380
Distinct (%): 10.5%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 MiB

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Characters and Unicode

Total characters: 19768040
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 39084
Unique (%): 4.0%

Sample

1st row: 2016-08-30T13:42:00Z
2nd row: 2022-10-26T17:57:00Z
3rd row: 2020-05-10T23:06:00Z
4th row: 2020-04-09T11:53:00Z
5th row: 2021-09-10T21:16:00Z
Value Count Frequency (%)
2024-10-17t09:48:00z 1536
 
0.2%
2024-10-17t09:52:00z 1531
 
0.2%
2024-10-17t09:51:00z 1451
 
0.1%
2024-10-17t09:55:00z 1419
 
0.1%
2024-10-17t09:49:00z 1377
 
0.1%
2024-10-17t09:50:00z 1314
 
0.1%
2024-10-17t09:53:00z 1255
 
0.1%
2024-10-17t09:54:00z 1248
 
0.1%
2024-10-17t09:57:00z 1194
 
0.1%
2024-10-17t09:56:00z 1136
 
0.1%
Other values (103370) 974941
98.6%

Most occurring characters

ValueCountFrequency (%)
0 5048546
25.5%
2 2613229
13.2%
1 2474741
12.5%
- 1976804
 
10.0%
: 1976804
 
10.0%
T 988402
 
5.0%
Z 988402
 
5.0%
3 648649
 
3.3%
8 570563
 
2.9%
9 562325
 
2.8%
Other values (4) 1919575
 
9.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 13837628
70.0%
Dash Punctuation 1976804
 
10.0%
Other Punctuation 1976804
 
10.0%
Uppercase Letter 1976804
 
10.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 5048546
36.5%
2 2613229
18.9%
1 2474741
17.9%
3 648649
 
4.7%
8 570563
 
4.1%
9 562325
 
4.1%
4 522023
 
3.8%
7 507317
 
3.7%
5 461935
 
3.3%
6 428300
 
3.1%
Uppercase Letter
ValueCountFrequency (%)
T 988402
50.0%
Z 988402
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 1976804
100.0%
Other Punctuation
ValueCountFrequency (%)
: 1976804
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 17791236
90.0%
Latin 1976804
 
10.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 5048546
28.4%
2 2613229
14.7%
1 2474741
13.9%
- 1976804
 
11.1%
: 1976804
 
11.1%
3 648649
 
3.6%
8 570563
 
3.2%
9 562325
 
3.2%
4 522023
 
2.9%
7 507317
 
2.9%
Other values (2) 890235
 
5.0%
Latin
ValueCountFrequency (%)
T 988402
50.0%
Z 988402
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19768040
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 5048546
25.5%
2 2613229
13.2%
1 2474741
12.5%
- 1976804
 
10.0%
: 1976804
 
10.0%
T 988402
 
5.0%
Z 988402
 
5.0%
3 648649
 
3.3%
8 570563
 
2.9%
9 562325
 
2.8%
Other values (4) 1919575
 
9.7%

publisher
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 MiB

Length

Max length: 59
Median length: 59
Mean length: 59
Min length: 59

Characters and Unicode

Total characters: 58315718
Distinct characters: 21
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: National Museum of Natural History, Smithsonian Institution
2nd row: National Museum of Natural History, Smithsonian Institution
3rd row: National Museum of Natural History, Smithsonian Institution
4th row: National Museum of Natural History, Smithsonian Institution
5th row: National Museum of Natural History, Smithsonian Institution
Value Count Frequency (%)
national 988402
14.3%
museum 988402
14.3%
of 988402
14.3%
natural 988402
14.3%
history 988402
14.3%
smithsonian 988402
14.3%
institution 988402
14.3%

Most occurring characters

ValueCountFrequency (%)
t 6918814
11.9%
i 5930412
10.2%
(space) 5930412
10.2%
a 4942010
 
8.5%
o 4942010
 
8.5%
n 4942010
 
8.5%
s 3953608
 
6.8%
u 3953608
 
6.8%
r 1976804
 
3.4%
m 1976804
 
3.4%
Other values (11) 12849226
22.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 45466492
78.0%
Space Separator 5930412
 
10.2%
Uppercase Letter 5930412
 
10.2%
Other Punctuation 988402
 
1.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 6918814
15.2%
i 5930412
13.0%
a 4942010
10.9%
o 4942010
10.9%
n 4942010
10.9%
s 3953608
8.7%
u 3953608
8.7%
r 1976804
 
4.3%
m 1976804
 
4.3%
l 1976804
 
4.3%
Other values (4) 3953608
8.7%
Uppercase Letter
ValueCountFrequency (%)
N 1976804
33.3%
M 988402
16.7%
H 988402
16.7%
S 988402
16.7%
I 988402
16.7%
Space Separator
ValueCountFrequency (%)
(space) 5930412
100.0%
Other Punctuation
ValueCountFrequency (%)
, 988402
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 51396904
88.1%
Common 6918814
 
11.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 6918814
13.5%
i 5930412
11.5%
a 4942010
9.6%
o 4942010
9.6%
n 4942010
9.6%
s 3953608
 
7.7%
u 3953608
 
7.7%
r 1976804
 
3.8%
m 1976804
 
3.8%
N 1976804
 
3.8%
Other values (9) 9884020
19.2%
Common
ValueCountFrequency (%)
(space) 5930412
85.7%
, 988402
 
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 58315718
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 6918814
11.9%
i 5930412
10.2%
(space) 5930412
10.2%
a 4942010
 
8.5%
o 4942010
 
8.5%
n 4942010
 
8.5%
s 3953608
 
6.8%
u 3953608
 
6.8%
r 1976804
 
3.4%
m 1976804
 
3.4%
Other values (11) 12849226
22.0%

institutionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 MiB

Length

Max length: 29
Median length: 29
Mean length: 29
Min length: 29

Characters and Unicode

Total characters: 28663658
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:lsid:biocol.org:col:15463
2nd row: urn:lsid:biocol.org:col:15463
3rd row: urn:lsid:biocol.org:col:15463
4th row: urn:lsid:biocol.org:col:15463
5th row: urn:lsid:biocol.org:col:15463
Value Count Frequency (%)
urn:lsid:biocol.org:col:15463 988402
100.0%

Most occurring characters

ValueCountFrequency (%)
o 3953608
13.8%
: 3953608
13.8%
l 2965206
 
10.3%
i 1976804
 
6.9%
r 1976804
 
6.9%
c 1976804
 
6.9%
g 988402
 
3.4%
6 988402
 
3.4%
4 988402
 
3.4%
5 988402
 
3.4%
Other values (8) 7907216
27.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18779638
65.5%
Other Punctuation 4942010
 
17.2%
Decimal Number 4942010
 
17.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 3953608
21.1%
l 2965206
15.8%
i 1976804
10.5%
r 1976804
10.5%
c 1976804
10.5%
g 988402
 
5.3%
u 988402
 
5.3%
b 988402
 
5.3%
d 988402
 
5.3%
s 988402
 
5.3%
Decimal Number
ValueCountFrequency (%)
6 988402
20.0%
4 988402
20.0%
5 988402
20.0%
1 988402
20.0%
3 988402
20.0%
Other Punctuation
ValueCountFrequency (%)
: 3953608
80.0%
. 988402
 
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18779638
65.5%
Common 9884020
34.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 3953608
21.1%
l 2965206
15.8%
i 1976804
10.5%
r 1976804
10.5%
c 1976804
10.5%
g 988402
 
5.3%
u 988402
 
5.3%
b 988402
 
5.3%
d 988402
 
5.3%
s 988402
 
5.3%
Common
ValueCountFrequency (%)
: 3953608
40.0%
6 988402
 
10.0%
4 988402
 
10.0%
5 988402
 
10.0%
1 988402
 
10.0%
. 988402
 
10.0%
3 988402
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 28663658
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 3953608
13.8%
: 3953608
13.8%
l 2965206
 
10.3%
i 1976804
 
6.9%
r 1976804
 
6.9%
c 1976804
 
6.9%
g 988402
 
3.4%
6 988402
 
3.4%
4 988402
 
3.4%
5 988402
 
3.4%
Other values (8) 7907216
27.6%

collectionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 MiB

Length

Max length: 45
Median length: 45
Mean length: 45
Min length: 45

Characters and Unicode

Total characters: 44478090
Distinct characters: 20
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:uuid:60e28f81-e634-4869-aa3e-732caed713c8
2nd row: urn:uuid:60e28f81-e634-4869-aa3e-732caed713c8
3rd row: urn:uuid:60e28f81-e634-4869-aa3e-732caed713c8
4th row: urn:uuid:60e28f81-e634-4869-aa3e-732caed713c8
5th row: urn:uuid:60e28f81-e634-4869-aa3e-732caed713c8
Value Count Frequency (%)
urn:uuid:60e28f81-e634-4869-aa3e-732caed713c8 988402
100.0%

Most occurring characters

ValueCountFrequency (%)
8 3953608
 
8.9%
3 3953608
 
8.9%
- 3953608
 
8.9%
e 3953608
 
8.9%
6 2965206
 
6.7%
a 2965206
 
6.7%
u 2965206
 
6.7%
d 1976804
 
4.4%
2 1976804
 
4.4%
1 1976804
 
4.4%
Other values (10) 13837628
31.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 20756442
46.7%
Lowercase Letter 17791236
40.0%
Dash Punctuation 3953608
 
8.9%
Other Punctuation 1976804
 
4.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8 3953608
19.0%
3 3953608
19.0%
6 2965206
14.3%
2 1976804
9.5%
1 1976804
9.5%
4 1976804
9.5%
7 1976804
9.5%
0 988402
 
4.8%
9 988402
 
4.8%
Lowercase Letter
ValueCountFrequency (%)
e 3953608
22.2%
a 2965206
16.7%
u 2965206
16.7%
d 1976804
11.1%
c 1976804
11.1%
r 988402
 
5.6%
f 988402
 
5.6%
i 988402
 
5.6%
n 988402
 
5.6%
Dash Punctuation
ValueCountFrequency (%)
- 3953608
100.0%
Other Punctuation
ValueCountFrequency (%)
: 1976804
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 26686854
60.0%
Latin 17791236
40.0%

Most frequent character per script

Common
ValueCountFrequency (%)
8 3953608
14.8%
3 3953608
14.8%
- 3953608
14.8%
6 2965206
11.1%
2 1976804
7.4%
1 1976804
7.4%
: 1976804
7.4%
4 1976804
7.4%
7 1976804
7.4%
0 988402
 
3.7%
Latin
ValueCountFrequency (%)
e 3953608
22.2%
a 2965206
16.7%
u 2965206
16.7%
d 1976804
11.1%
c 1976804
11.1%
r 988402
 
5.6%
f 988402
 
5.6%
i 988402
 
5.6%
n 988402
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 44478090
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8 3953608
 
8.9%
3 3953608
 
8.9%
- 3953608
 
8.9%
e 3953608
 
8.9%
6 2965206
 
6.7%
a 2965206
 
6.7%
u 2965206
 
6.7%
d 1976804
 
4.4%
2 1976804
 
4.4%
1 1976804
 
4.4%
Other values (10) 13837628
31.1%

institutionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1976804
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: US
2nd row: US
3rd row: US
4th row: US
5th row: US
Value Count Frequency (%)
us 988402
100.0%

Most occurring characters

ValueCountFrequency (%)
U 988402
50.0%
S 988402
50.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1976804
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 988402
50.0%
S 988402
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1976804
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 988402
50.0%
S 988402
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1976804
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 988402
50.0%
S 988402
50.0%

collectionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1976804
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: US
2nd row: US
3rd row: US
4th row: US
5th row: US
Value Count Frequency (%)
us 988402
100.0%

Most occurring characters

ValueCountFrequency (%)
U 988402
50.0%
S 988402
50.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1976804
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 988402
50.0%
S 988402
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1976804
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 988402
50.0%
S 988402
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1976804
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 988402
50.0%
S 988402
50.0%

datasetName
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 18779638
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NMNH Extant Biology
2nd row: NMNH Extant Biology
3rd row: NMNH Extant Biology
4th row: NMNH Extant Biology
5th row: NMNH Extant Biology
Value Count Frequency (%)
nmnh 988402
33.3%
extant 988402
33.3%
biology 988402
33.3%

Most occurring characters

ValueCountFrequency (%)
N 1976804
 
10.5%
(space) 1976804
 
10.5%
t 1976804
 
10.5%
o 1976804
 
10.5%
M 988402
 
5.3%
H 988402
 
5.3%
E 988402
 
5.3%
x 988402
 
5.3%
a 988402
 
5.3%
n 988402
 
5.3%
Other values (5) 4942010
26.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10872422
57.9%
Uppercase Letter 5930412
31.6%
Space Separator 1976804
 
10.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 1976804
18.2%
o 1976804
18.2%
x 988402
9.1%
a 988402
9.1%
n 988402
9.1%
i 988402
9.1%
l 988402
9.1%
g 988402
9.1%
y 988402
9.1%
Uppercase Letter
ValueCountFrequency (%)
N 1976804
33.3%
M 988402
16.7%
H 988402
16.7%
E 988402
16.7%
B 988402
16.7%
Space Separator
ValueCountFrequency (%)
(space) 1976804
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16802834
89.5%
Common 1976804
 
10.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 1976804
11.8%
t 1976804
11.8%
o 1976804
11.8%
M 988402
 
5.9%
H 988402
 
5.9%
E 988402
 
5.9%
x 988402
 
5.9%
a 988402
 
5.9%
n 988402
 
5.9%
B 988402
 
5.9%
Other values (4) 3953608
23.5%
Common
ValueCountFrequency (%)
(space) 1976804
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18779638
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 1976804
 
10.5%
(space) 1976804
 
10.5%
t 1976804
 
10.5%
o 1976804
 
10.5%
M 988402
 
5.3%
H 988402
 
5.3%
E 988402
 
5.3%
x 988402
 
5.3%
a 988402
 
5.3%
n 988402
 
5.3%
Other values (5) 4942010
26.3%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 MiB

Length

Max length: 19
Median length: 18
Mean length: 18.01104712
Min length: 18

Characters and Unicode

Total characters: 17802155
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PRESERVED_SPECIMEN
2nd row: PRESERVED_SPECIMEN
3rd row: PRESERVED_SPECIMEN
4th row: PRESERVED_SPECIMEN
5th row: PRESERVED_SPECIMEN
Value Count Frequency (%)
preserved_specimen 977483
98.9%
machine_observation 10919
 
1.1%
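
The length statistics for this variable can be cross-checked against the two value counts above. A short Python sketch, assuming the row counts shown in the frequency table and the uppercase spelling seen in the sample rows:

```python
# Value -> row count, taken from the frequency table for this variable
# (spelled in uppercase, as in the sample rows).
counts = {"PRESERVED_SPECIMEN": 977_483, "MACHINE_OBSERVATION": 10_919}

total_rows = sum(counts.values())
# Each value contributes its own length once per row it appears in.
total_chars = sum(len(value) * n for value, n in counts.items())
mean_length = total_chars / total_rows

print(total_rows, total_chars, round(mean_length, 8))
```

The totals agree with the reported 988402 observations and 17802155 total characters, and the mean sits just above 18 because about 1.1% of rows carry the 19-character value.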

Most occurring characters

ValueCountFrequency (%)
E 4909253
27.6%
R 1965885
11.0%
S 1965885
11.0%
P 1954966
 
11.0%
I 999321
 
5.6%
N 999321
 
5.6%
V 988402
 
5.6%
_ 988402
 
5.6%
C 988402
 
5.6%
M 988402
 
5.6%
Other values (6) 1053916
 
5.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 16813753
94.4%
Connector Punctuation 988402
 
5.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 4909253
29.2%
R 1965885
11.7%
S 1965885
11.7%
P 1954966
 
11.6%
I 999321
 
5.9%
N 999321
 
5.9%
V 988402
 
5.9%
C 988402
 
5.9%
M 988402
 
5.9%
D 977483
 
5.8%
Other values (5) 76433
 
0.5%
Connector Punctuation
ValueCountFrequency (%)
_ 988402
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16813753
94.4%
Common 988402
 
5.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 4909253
29.2%
R 1965885
11.7%
S 1965885
11.7%
P 1954966
 
11.6%
I 999321
 
5.9%
N 999321
 
5.9%
V 988402
 
5.9%
C 988402
 
5.9%
M 988402
 
5.9%
D 977483
 
5.8%
Other values (5) 76433
 
0.5%
Common
ValueCountFrequency (%)
_ 988402
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17802155
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 4909253
27.6%
R 1965885
11.0%
S 1965885
11.0%
P 1954966
 
11.0%
I 999321
 
5.6%
N 999321
 
5.6%
V 988402
 
5.6%
_ 988402
 
5.6%
C 988402
 
5.6%
M 988402
 
5.6%
Other values (6) 1053916
 
5.9%

occurrenceID
Text

Unique 

Distinct: 988402
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 MiB

Length

Max length: 63
Median length: 63
Mean length: 63
Min length: 63

Characters and Unicode

Total characters: 62269326
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 988402
Unique (%): 100.0%

Sample

1st row: http://n2t.net/ark:/65665/383aab1ce-8b35-4007-8eba-472b592b7a99
2nd row: http://n2t.net/ark:/65665/3c8351e79-8b3b-4df0-80be-cb019ba60185
3rd row: http://n2t.net/ark:/65665/3c8377593-a51b-4b6a-835d-649053b2ef0f
4th row: http://n2t.net/ark:/65665/383b388e9-b7cc-4b41-95cc-e0a1b092179a
5th row: http://n2t.net/ark:/65665/3c83e5abc-b64e-45a4-aa42-faf5abc93792
Value Count Frequency (%)
http://n2t.net/ark:/65665/383aab1ce-8b35-4007-8eba-472b592b7a99 1
 
< 0.1%
http://n2t.net/ark:/65665/384f1e549-df99-4ba7-87ad-271f1281c0f1 1
 
< 0.1%
http://n2t.net/ark:/65665/383de17db-7e58-4b17-9277-c255affc4cdb 1
 
< 0.1%
http://n2t.net/ark:/65665/3c8910e37-f290-4195-961c-13f0efedc290 1
 
< 0.1%
http://n2t.net/ark:/65665/383c2daca-1de6-4e79-9d0c-f3b5838bafb2 1
 
< 0.1%
http://n2t.net/ark:/65665/3c8377593-a51b-4b6a-835d-649053b2ef0f 1
 
< 0.1%
http://n2t.net/ark:/65665/383b388e9-b7cc-4b41-95cc-e0a1b092179a 1
 
< 0.1%
http://n2t.net/ark:/65665/3c83e5abc-b64e-45a4-aa42-faf5abc93792 1
 
< 0.1%
http://n2t.net/ark:/65665/3c83f60ef-2f0d-451e-986a-e0c2dfb03675 1
 
< 0.1%
http://n2t.net/ark:/65665/383b6d73e-eb70-4b52-81b8-336878ca92f0 1
 
< 0.1%
Other values (988392) 988392
> 99.9%

Most occurring characters

ValueCountFrequency (%)
/ 4942010
 
7.9%
6 4818872
 
7.7%
- 3953608
 
6.3%
t 3953608
 
6.3%
5 3831724
 
6.2%
a 3087982
 
5.0%
4 2844631
 
4.6%
e 2842218
 
4.6%
2 2841949
 
4.6%
3 2841551
 
4.6%
Other values (16) 26311173
42.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 26932854
43.3%
Lowercase Letter 23475648
37.7%
Other Punctuation 7907216
 
12.7%
Dash Punctuation 3953608
 
6.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 3953608
16.8%
a 3087982
13.2%
e 2842218
12.1%
b 2099835
8.9%
n 1976804
8.4%
f 1854245
7.9%
d 1853885
7.9%
c 1853463
7.9%
k 988402
 
4.2%
r 988402
 
4.2%
Other values (2) 1976804
8.4%
Decimal Number
ValueCountFrequency (%)
6 4818872
17.9%
5 3831724
14.2%
4 2844631
10.6%
2 2841949
10.6%
3 2841551
10.6%
9 2101095
7.8%
8 2097885
7.8%
7 1853396
 
6.9%
0 1851151
 
6.9%
1 1850600
 
6.9%
Other Punctuation
ValueCountFrequency (%)
/ 4942010
62.5%
: 1976804
 
25.0%
. 988402
 
12.5%
Dash Punctuation
ValueCountFrequency (%)
- 3953608
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 38793678
62.3%
Latin 23475648
37.7%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 4942010
12.7%
6 4818872
12.4%
- 3953608
10.2%
5 3831724
9.9%
4 2844631
7.3%
2 2841949
7.3%
3 2841551
7.3%
9 2101095
 
5.4%
8 2097885
 
5.4%
: 1976804
 
5.1%
Other values (4) 6543549
16.9%
Latin
ValueCountFrequency (%)
t 3953608
16.8%
a 3087982
13.2%
e 2842218
12.1%
b 2099835
8.9%
n 1976804
8.4%
f 1854245
7.9%
d 1853885
7.9%
c 1853463
7.9%
k 988402
 
4.2%
r 988402
 
4.2%
Other values (2) 1976804
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 62269326
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 4942010
 
7.9%
6 4818872
 
7.7%
- 3953608
 
6.3%
t 3953608
 
6.3%
5 3831724
 
6.2%
a 3087982
 
5.0%
4 2844631
 
4.6%
e 2842218
 
4.6%
2 2841949
 
4.6%
3 2841551
 
4.6%
Other values (16) 26311173
42.3%

catalogNumber
Text

Missing 

Distinct: 843685
Distinct (%): 98.6%
Missing: 132504
Missing (%): 13.4%
Memory size: 7.5 MiB

Length

Max length: 22
Median length: 10
Mean length: 9.636065279
Min length: 4

Characters and Unicode

Total characters: 8247489
Distinct characters: 46
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 832040
Unique (%): 97.2%

Sample

1st row: US 213621
2nd row: US 2144946
3rd row: US 3113222
4th row: US 2583825
5th row: US 3026466
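
The Distinct and Unique figures for catalogNumber differ, and the sketch below shows one reading that fits the numbers: Distinct counts the different non-missing values, Unique counts the values that occur exactly once, and both percentages are taken over non-missing rows. This is an assumption inferred from the arithmetic, not a statement of how the report was computed, and the helper name `distinct_and_unique` is hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique) counts, ignoring missing (None) entries."""
    present = [v for v in values if v is not None]
    freq = Counter(present)
    distinct = len(freq)                                 # different values seen
    unique = sum(1 for n in freq.values() if n == 1)     # values seen exactly once
    return distinct, unique


# Toy column: "US 2" is repeated, one entry is missing.
toy = ["US 1", "US 2", "US 2", None, "US 3"]
print(distinct_and_unique(toy))

# Check the reading against the reported catalogNumber figures.
non_missing = 988402 - 132504
print(round(100 * 843685 / non_missing, 1))  # Distinct (%)
print(round(100 * 832040 / non_missing, 1))  # Unique (%)
```

Under this reading the two percentages come out at 98.6 and 97.2, matching the values listed for the variable.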
ValueCountFrequency (%)
us 846588
49.7%
sem 52
 
< 0.1%
1 35
 
< 0.1%
27
 
< 0.1%
micrograph 26
 
< 0.1%
stub 26
 
< 0.1%
3 15
 
< 0.1%
2 13
 
< 0.1%
169920 12
 
< 0.1%
95489 9
 
< 0.1%
Other values (843649) 855865
50.3%

Most occurring characters

ValueCountFrequency (%)
S 855977
10.4%
U 855899
10.4%
846770
10.3%
2 752400
9.1%
1 735998
8.9%
3 670736
8.1%
5 512863
 
6.2%
4 511349
 
6.2%
6 510981
 
6.2%
7 501756
 
6.1%
Other values (36) 1492760
18.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5660089
68.6%
Uppercase Letter 1725893
 
20.9%
Space Separator 846770
 
10.3%
Lowercase Letter 9731
 
0.1%
Dash Punctuation 4981
 
0.1%
Close Punctuation 10
 
< 0.1%
Open Punctuation 10
 
< 0.1%
Other Punctuation 5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
w 9311
95.7%
r 53
 
0.5%
a 42
 
0.4%
u 36
 
0.4%
p 30
 
0.3%
b 29
 
0.3%
i 28
 
0.3%
o 27
 
0.3%
m 27
 
0.3%
c 27
 
0.3%
Other values (10) 121
 
1.2%
Uppercase Letter
ValueCountFrequency (%)
S 855977
49.6%
U 855899
49.6%
D 8488
 
0.5%
A 5354
 
0.3%
E 73
 
< 0.1%
M 52
 
< 0.1%
P 21
 
< 0.1%
B 18
 
< 0.1%
L 10
 
< 0.1%
V 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 752400
13.3%
1 735998
13.0%
3 670736
11.9%
5 512863
9.1%
4 511349
9.0%
6 510981
9.0%
7 501756
8.9%
0 492646
8.7%
9 485776
8.6%
8 485584
8.6%
Other Punctuation
ValueCountFrequency (%)
. 4
80.0%
? 1
 
20.0%
Space Separator
ValueCountFrequency (%)
846770
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4981
100.0%
Close Punctuation
ValueCountFrequency (%)
) 10
100.0%
Open Punctuation
ValueCountFrequency (%)
( 10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6511865
79.0%
Latin 1735624
 
21.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 855977
49.3%
U 855899
49.3%
w 9311
 
0.5%
D 8488
 
0.5%
A 5354
 
0.3%
E 73
 
< 0.1%
r 53
 
< 0.1%
M 52
 
< 0.1%
a 42
 
< 0.1%
u 36
 
< 0.1%
Other values (20) 339
 
< 0.1%
Common
ValueCountFrequency (%)
846770
13.0%
2 752400
11.6%
1 735998
11.3%
3 670736
10.3%
5 512863
7.9%
4 511349
7.9%
6 510981
7.8%
7 501756
7.7%
0 492646
7.6%
9 485776
7.5%
Other values (6) 490590
7.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8247489
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 855977
10.4%
U 855899
10.4%
846770
10.3%
2 752400
9.1%
1 735998
8.9%
3 670736
8.1%
5 512863
 
6.2%
4 511349
 
6.2%
6 510981
 
6.2%
7 501756
 
6.1%
Other values (36) 1492760
18.1%

recordNumber
Text

Distinct: 163293
Distinct (%): 16.7%
Missing: 8698
Missing (%): 0.9%
Memory size: 7.5 MiB

Length

Max length: 90
Median length: 72
Mean length: 4.48941415
Min length: 1

Characters and Unicode

Total characters: 4398297
Distinct characters: 109
Distinct categories: 14
Distinct scripts: 3
Distinct blocks: 5

Unique

Unique: 114501
Unique (%): 11.7%

Sample

1st row: BLM-210-IV-11-B-TDS
2nd row: 4319
3rd row: 2429
4th row: 95426
5th row: 1414/512
ValueCountFrequency (%)
s.n 141397
 
13.6%
bureau 4447
 
0.4%
eyd 3365
 
0.3%
s 3110
 
0.3%
n 3006
 
0.3%
of 2991
 
0.3%
science 2898
 
0.3%
d&ml 2806
 
0.3%
2716
 
0.3%
h 1941
 
0.2%
Other values (128797) 872266
83.8%

Most occurring characters

ValueCountFrequency (%)
1 527431
12.0%
2 407354
9.3%
3 350940
 
8.0%
4 329711
 
7.5%
5 316988
 
7.2%
0 316877
 
7.2%
6 306238
 
7.0%
. 298537
 
6.8%
7 287805
 
6.5%
8 276699
 
6.3%
Other values (99) 979717
22.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3389386
77.1%
Lowercase Letter 401596
 
9.1%
Other Punctuation 316346
 
7.2%
Uppercase Letter 158999
 
3.6%
Dash Punctuation 66578
 
1.5%
Space Separator 61239
 
1.4%
Open Punctuation 1749
 
< 0.1%
Close Punctuation 1740
 
< 0.1%
Other Number 383
 
< 0.1%
Connector Punctuation 142
 
< 0.1%
Other values (4) 139
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 151013
37.6%
s 148056
36.9%
a 16112
 
4.0%
e 15277
 
3.8%
u 10511
 
2.6%
r 10123
 
2.5%
c 8774
 
2.2%
o 8180
 
2.0%
i 7692
 
1.9%
t 6218
 
1.5%
Other values (26) 19640
 
4.9%
Uppercase Letter
ValueCountFrequency (%)
A 15372
 
9.7%
B 15337
 
9.6%
S 14472
 
9.1%
D 11952
 
7.5%
H 10454
 
6.6%
L 9038
 
5.7%
M 8771
 
5.5%
E 8695
 
5.5%
I 7475
 
4.7%
N 6748
 
4.2%
Other values (18) 50685
31.9%
Other Punctuation
ValueCountFrequency (%)
. 298537
94.4%
& 5375
 
1.7%
/ 5177
 
1.6%
* 3127
 
1.0%
? 2292
 
0.7%
, 1072
 
0.3%
! 476
 
0.2%
# 84
 
< 0.1%
; 74
 
< 0.1%
: 68
 
< 0.1%
Other values (5) 64
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 527431
15.6%
2 407354
12.0%
3 350940
10.4%
4 329711
9.7%
5 316988
9.4%
0 316877
9.3%
6 306238
9.0%
7 287805
8.5%
8 276699
8.2%
9 269343
7.9%
Other Number
ValueCountFrequency (%)
½ 368
96.1%
² 6
 
1.6%
¼ 5
 
1.3%
¾ 3
 
0.8%
1
 
0.3%
Open Punctuation
ValueCountFrequency (%)
( 1585
90.6%
[ 105
 
6.0%
{ 59
 
3.4%
Close Punctuation
ValueCountFrequency (%)
) 1577
90.6%
] 104
 
6.0%
} 59
 
3.4%
Math Symbol
ValueCountFrequency (%)
= 112
82.4%
+ 23
 
16.9%
~ 1
 
0.7%
Dash Punctuation
ValueCountFrequency (%)
- 66578
100.0%
Space Separator
ValueCountFrequency (%)
61239
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 142
100.0%
Modifier Letter
ValueCountFrequency (%)
ˍ 1
100.0%
Other Letter
ValueCountFrequency (%)
ª 1
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3837701
87.3%
Latin 560595
 
12.7%
Greek 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 151013
26.9%
s 148056
26.4%
a 16112
 
2.9%
A 15372
 
2.7%
B 15337
 
2.7%
e 15277
 
2.7%
S 14472
 
2.6%
D 11952
 
2.1%
u 10511
 
1.9%
H 10454
 
1.9%
Other values (54) 152039
27.1%
Common
ValueCountFrequency (%)
1 527431
13.7%
2 407354
10.6%
3 350940
9.1%
4 329711
8.6%
5 316988
8.3%
0 316877
8.3%
6 306238
8.0%
. 298537
7.8%
7 287805
7.5%
8 276699
7.2%
Other values (34) 419121
10.9%
Greek
ValueCountFrequency (%)
Σ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4397881
> 99.9%
None 412
 
< 0.1%
Punctuation 2
 
< 0.1%
Modifier Letters 1
 
< 0.1%
Number Forms 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 527431
12.0%
2 407354
9.3%
3 350940
 
8.0%
4 329711
 
7.5%
5 316988
 
7.2%
0 316877
 
7.2%
6 306238
 
7.0%
. 298537
 
6.8%
7 287805
 
6.5%
8 276699
 
6.3%
Other values (78) 979301
22.3%
None
ValueCountFrequency (%)
½ 368
89.3%
è 11
 
2.7%
² 6
 
1.5%
¼ 5
 
1.2%
ü 4
 
1.0%
¾ 3
 
0.7%
é 3
 
0.7%
ú 2
 
0.5%
ó 2
 
0.5%
á 1
 
0.2%
Other values (7) 7
 
1.7%
Modifier Letters
ValueCountFrequency (%)
ˍ 1
100.0%
Number Forms
ValueCountFrequency (%)
1
100.0%
Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%

recordedBy
Text

Missing 

Distinct: 71729
Distinct (%): 7.3%
Missing: 11879
Missing (%): 1.2%
Memory size: 7.5 MiB

Length

Max length: 201
Median length: 155
Mean length: 17.24259848
Min length: 1

Characters and Unicode

Total characters: 16837794
Distinct characters: 140
Distinct categories: 13
Distinct scripts: 2
Distinct blocks: 3

Unique

Unique: 36179
Unique (%): 3.7%

Sample

1st row: Continental Shelf Associates for the MMS/BLM
2nd row: J. Soukup
3rd row: I. Morel
4th row: J. Steyermark & Cora Steyermark
5th row: A. Oakes & -. Ellis
ValueCountFrequency (%)
273336
 
7.3%
j 195095
 
5.2%
a 167294
 
4.5%
r 148560
 
4.0%
e 148212
 
4.0%
c 138644
 
3.7%
m 133736
 
3.6%
h 120329
 
3.2%
l 97924
 
2.6%
w 96924
 
2.6%
Other values (28460) 2203562
59.2%

Most occurring characters

ValueCountFrequency (%)
2747093
16.3%
. 2002049
 
11.9%
e 1082546
 
6.4%
r 792877
 
4.7%
a 787773
 
4.7%
o 664253
 
3.9%
n 660896
 
3.9%
l 631169
 
3.7%
i 544852
 
3.2%
t 439669
 
2.6%
Other values (130) 6484617
38.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8170236
48.5%
Uppercase Letter 3425777
20.3%
Space Separator 2747093
 
16.3%
Other Punctuation 2413800
 
14.3%
Dash Punctuation 73409
 
0.4%
Decimal Number 2960
 
< 0.1%
Close Punctuation 2241
 
< 0.1%
Open Punctuation 2241
 
< 0.1%
Math Symbol 21
 
< 0.1%
Modifier Symbol 8
 
< 0.1%
Other values (3) 8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1082546
13.2%
r 792877
9.7%
a 787773
9.6%
o 664253
 
8.1%
n 660896
 
8.1%
l 631169
 
7.7%
i 544852
 
6.7%
t 439669
 
5.4%
s 416260
 
5.1%
u 253955
 
3.1%
Other values (60) 1895986
23.2%
Uppercase Letter
ValueCountFrequency (%)
C 262797
 
7.7%
S 258602
 
7.5%
M 242234
 
7.1%
R 242073
 
7.1%
H 238630
 
7.0%
A 234211
 
6.8%
J 232414
 
6.8%
E 190208
 
5.6%
B 181150
 
5.3%
L 176618
 
5.2%
Other values (29) 1166840
34.1%
Other Punctuation
ValueCountFrequency (%)
. 2002049
82.9%
& 241593
 
10.0%
, 166145
 
6.9%
' 2858
 
0.1%
/ 830
 
< 0.1%
" 314
 
< 0.1%
? 6
 
< 0.1%
; 3
 
< 0.1%
: 1
 
< 0.1%
¡ 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
8 560
18.9%
1 529
17.9%
9 528
17.8%
0 390
13.2%
4 297
10.0%
3 275
9.3%
5 246
8.3%
2 89
 
3.0%
7 45
 
1.5%
6 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 2213
98.8%
] 28
 
1.2%
Open Punctuation
ValueCountFrequency (%)
( 2213
98.8%
[ 28
 
1.2%
Space Separator
ValueCountFrequency (%)
2747093
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 73409
100.0%
Math Symbol
ValueCountFrequency (%)
= 21
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 8
100.0%
Other Symbol
ValueCountFrequency (%)
° 4
100.0%
Final Punctuation
ValueCountFrequency (%)
» 2
100.0%
Initial Punctuation
ValueCountFrequency (%)
« 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11596013
68.9%
Common 5241781
31.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1082546
 
9.3%
r 792877
 
6.8%
a 787773
 
6.8%
o 664253
 
5.7%
n 660896
 
5.7%
l 631169
 
5.4%
i 544852
 
4.7%
t 439669
 
3.8%
s 416260
 
3.6%
C 262797
 
2.3%
Other values (99) 5312921
45.8%
Common
ValueCountFrequency (%)
2747093
52.4%
. 2002049
38.2%
& 241593
 
4.6%
, 166145
 
3.2%
- 73409
 
1.4%
' 2858
 
0.1%
) 2213
 
< 0.1%
( 2213
 
< 0.1%
/ 830
 
< 0.1%
8 560
 
< 0.1%
Other values (21) 2818
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16783092
99.7%
None 54701
 
0.3%
IPA Ext 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2747093
16.4%
. 2002049
 
11.9%
e 1082546
 
6.5%
r 792877
 
4.7%
a 787773
 
4.7%
o 664253
 
4.0%
n 660896
 
3.9%
l 631169
 
3.8%
i 544852
 
3.2%
t 439669
 
2.6%
Other values (68) 6429915
38.3%
None
ValueCountFrequency (%)
é 9407
17.2%
á 9291
17.0%
ó 8471
15.5%
í 6255
11.4%
ñ 5412
9.9%
è 3789
6.9%
ü 3008
 
5.5%
ö 2276
 
4.2%
ê 1495
 
2.7%
ä 701
 
1.3%
Other values (51) 4596
8.4%
IPA Ext
ValueCountFrequency (%)
ɶ 1
100.0%

individualCount
Text

Distinct: 18
Distinct (%): < 0.1%
Missing: 117
Missing (%): < 0.1%
Memory size: 7.5 MiB

Length

Max length: 2
Median length: 1
Mean length: 1.00001113
Min length: 1

Characters and Unicode

Total characters: 988296
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 6
Unique (%): < 0.1%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1
ValueCountFrequency (%)
1 988001
> 99.9%
2 114
 
< 0.1%
0 64
 
< 0.1%
3 35
 
< 0.1%
4 26
 
< 0.1%
5 14
 
< 0.1%
6 8
 
< 0.1%
9 5
 
< 0.1%
7 5
 
< 0.1%
11 3
 
< 0.1%
Other values (8) 10
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
1 988014
> 99.9%
2 117
 
< 0.1%
0 66
 
< 0.1%
3 35
 
< 0.1%
4 27
 
< 0.1%
5 14
 
< 0.1%
6 9
 
< 0.1%
9 6
 
< 0.1%
7 5
 
< 0.1%
8 3
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 988296
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 988014
> 99.9%
2 117
 
< 0.1%
0 66
 
< 0.1%
3 35
 
< 0.1%
4 27
 
< 0.1%
5 14
 
< 0.1%
6 9
 
< 0.1%
9 6
 
< 0.1%
7 5
 
< 0.1%
8 3
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 988296
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 988014
> 99.9%
2 117
 
< 0.1%
0 66
 
< 0.1%
3 35
 
< 0.1%
4 27
 
< 0.1%
5 14
 
< 0.1%
6 9
 
< 0.1%
9 6
 
< 0.1%
7 5
 
< 0.1%
8 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 988296
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 988014
> 99.9%
2 117
 
< 0.1%
0 66
 
< 0.1%
3 35
 
< 0.1%
4 27
 
< 0.1%
5 14
 
< 0.1%
6 9
 
< 0.1%
9 6
 
< 0.1%
7 5
 
< 0.1%
8 3
 
< 0.1%

lifeStage
Text

Missing 

Distinct: 3
Distinct (%): < 0.1%
Missing: 916836
Missing (%): 92.8%
Memory size: 7.5 MiB

Length

Max length: 10
Median length: 9
Mean length: 8.755330744
Min length: 8

Characters and Unicode

Total characters: 626584
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Fruiting
2nd row: Flowering
3rd row: Flowering
4th row: Flowering
5th row: Flowering
ValueCountFrequency (%)
flowering 43566
60.9%
fruiting 22755
31.8%
vegetative 5245
 
7.3%

Most occurring characters

ValueCountFrequency (%)
i 94321
15.1%
g 71566
11.4%
F 66321
10.6%
r 66321
10.6%
n 66321
10.6%
e 59301
9.5%
l 43566
7.0%
o 43566
7.0%
w 43566
7.0%
t 33245
 
5.3%
Other values (4) 38490
6.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 555018
88.6%
Uppercase Letter 71566
 
11.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 94321
17.0%
g 71566
12.9%
r 66321
11.9%
n 66321
11.9%
e 59301
10.7%
l 43566
7.8%
o 43566
7.8%
w 43566
7.8%
t 33245
 
6.0%
u 22755
 
4.1%
Other values (2) 10490
 
1.9%
Uppercase Letter
ValueCountFrequency (%)
F 66321
92.7%
V 5245
 
7.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 626584
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 94321
15.1%
g 71566
11.4%
F 66321
10.6%
r 66321
10.6%
n 66321
10.6%
e 59301
9.5%
l 43566
7.0%
o 43566
7.0%
w 43566
7.0%
t 33245
 
5.3%
Other values (4) 38490
6.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 626584
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 94321
15.1%
g 71566
11.4%
F 66321
10.6%
r 66321
10.6%
n 66321
10.6%
e 59301
9.5%
l 43566
7.0%
o 43566
7.0%
w 43566
7.0%
t 33245
 
5.3%
Other values (4) 38490
6.1%

occurrenceStatus
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 6918814
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PRESENT
2nd row: PRESENT
3rd row: PRESENT
4th row: PRESENT
5th row: PRESENT
ValueCountFrequency (%)
present 988402
100.0%

Most occurring characters

ValueCountFrequency (%)
E 1976804
28.6%
P 988402
14.3%
R 988402
14.3%
S 988402
14.3%
N 988402
14.3%
T 988402
14.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 6918814
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 1976804
28.6%
P 988402
14.3%
R 988402
14.3%
S 988402
14.3%
N 988402
14.3%
T 988402
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 6918814
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 1976804
28.6%
P 988402
14.3%
R 988402
14.3%
S 988402
14.3%
N 988402
14.3%
T 988402
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6918814
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 1976804
28.6%
P 988402
14.3%
R 988402
14.3%
S 988402
14.3%
N 988402
14.3%
T 988402
14.3%

preparations
Text

Missing 

Distinct: 77
Distinct (%): 0.3%
Missing: 959242
Missing (%): 97.0%
Memory size: 7.5 MiB

Length

Max length: 142
Median length: 94
Mean length: 13.18954047
Min length: 3

Characters and Unicode

Total characters: 384607
Distinct characters: 43
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 26
Unique (%): 0.1%

Sample

1st row: Wood Sample
2nd row: Photograph
3rd row: Microslide
4th row: Photograph
5th row: Photograph; Photograph
ValueCountFrequency (%)
wood 9236
18.9%
sample 9236
18.9%
microslide 8980
18.3%
photograph 7481
15.3%
individual 4028
8.2%
strewn 2184
 
4.5%
sem 1492
 
3.0%
micrograph 1411
 
2.9%
ink 1139
 
2.3%
and 637
 
1.3%
Other values (48) 3129
 
6.4%

Most occurring characters

ValueCountFrequency (%)
o 44561
 
11.6%
i 33828
 
8.8%
d 27045
 
7.0%
a 23965
 
6.2%
l 23811
 
6.2%
r 23041
 
6.0%
e 21867
 
5.7%
19793
 
5.1%
p 18828
 
4.9%
h 16434
 
4.3%
Other values (33) 131434
34.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 310828
80.8%
Uppercase Letter 41028
 
10.7%
Space Separator 19793
 
5.1%
Close Punctuation 6212
 
1.6%
Open Punctuation 6212
 
1.6%
Other Punctuation 534
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 44561
14.3%
i 33828
10.9%
d 27045
8.7%
a 23965
 
7.7%
l 23811
 
7.7%
r 23041
 
7.4%
e 21867
 
7.0%
p 18828
 
6.1%
h 16434
 
5.3%
s 11393
 
3.7%
Other values (16) 66055
21.3%
Uppercase Letter
ValueCountFrequency (%)
S 10846
26.4%
M 10477
25.5%
W 9389
22.9%
P 7465
18.2%
E 1537
 
3.7%
B 589
 
1.4%
I 507
 
1.2%
F 147
 
0.4%
D 52
 
0.1%
T 16
 
< 0.1%
Other values (2) 3
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
; 527
98.7%
, 7
 
1.3%
Space Separator
ValueCountFrequency (%)
19793
100.0%
Close Punctuation
ValueCountFrequency (%)
) 6212
100.0%
Open Punctuation
ValueCountFrequency (%)
( 6212
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 351856
91.5%
Common 32751
 
8.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 44561
12.7%
i 33828
 
9.6%
d 27045
 
7.7%
a 23965
 
6.8%
l 23811
 
6.8%
r 23041
 
6.5%
e 21867
 
6.2%
p 18828
 
5.4%
h 16434
 
4.7%
s 11393
 
3.2%
Other values (28) 107083
30.4%
Common
ValueCountFrequency (%)
19793
60.4%
) 6212
 
19.0%
( 6212
 
19.0%
; 527
 
1.6%
, 7
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 384607
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 44561
 
11.6%
i 33828
 
8.8%
d 27045
 
7.0%
a 23965
 
6.2%
l 23811
 
6.2%
r 23041
 
6.0%
e 21867
 
5.7%
19793
 
5.1%
p 18828
 
4.9%
h 16434
 
4.3%
Other values (33) 131434
34.2%

associatedSequences
Text

Missing 

Distinct: 73
Distinct (%): 98.6%
Missing: 988328
Missing (%): > 99.9%
Memory size: 7.5 MiB

Length

Max length: 249
Median length: 199
Mean length: 146.972973
Min length: 49

Characters and Unicode

Total characters: 10876
Distinct characters: 48
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 72
Unique (%): 97.3%

Sample

1st row: https://www.ncbi.nlm.nih.gov/gquery?term=ON553270
2nd row: https://www.ncbi.nlm.nih.gov/gquery?term=MT553291
3rd row: https://www.ncbi.nlm.nih.gov/gquery?term=MT553246
4th row: https://www.ncbi.nlm.nih.gov/gquery?term=MT553283
5th row: https://www.ncbi.nlm.nih.gov/gquery?term=EU527211;https://www.ncbi.nlm.nih.gov/gquery?term=EU527308;https://www.ncbi.nlm.nih.gov/gquery?term=EU527261
ValueCountFrequency (%)
https://www.ncbi.nlm.nih.gov/gquery?term=jn837181;https://www.ncbi.nlm.nih.gov/gquery?term=jn837465;https://www.ncbi.nlm.nih.gov/gquery?term=jn837361;https://www.ncbi.nlm.nih.gov/gquery?term=jn837271 2
 
2.7%
https://www.ncbi.nlm.nih.gov/gquery?term=jn837116;https://www.ncbi.nlm.nih.gov/gquery?term=jn837405;https://www.ncbi.nlm.nih.gov/gquery?term=jn837297;https://www.ncbi.nlm.nih.gov/gquery?term=jn837206 1
 
1.4%
https://www.ncbi.nlm.nih.gov/gquery?term=mt553291 1
 
1.4%
https://www.ncbi.nlm.nih.gov/gquery?term=mt553246 1
 
1.4%
https://www.ncbi.nlm.nih.gov/gquery?term=mt553283 1
 
1.4%
https://www.ncbi.nlm.nih.gov/gquery?term=eu527211;https://www.ncbi.nlm.nih.gov/gquery?term=eu527308;https://www.ncbi.nlm.nih.gov/gquery?term=eu527261 1
 
1.4%
https://www.ncbi.nlm.nih.gov/gquery?term=kf989590;https://www.ncbi.nlm.nih.gov/gquery?term=kf989809;https://www.ncbi.nlm.nih.gov/gquery?term=kf990009;https://www.ncbi.nlm.nih.gov/gquery?term=kf989698 1
 
1.4%
https://www.ncbi.nlm.nih.gov/gquery?term=jn837113;https://www.ncbi.nlm.nih.gov/gquery?term=jn837294;https://www.ncbi.nlm.nih.gov/gquery?term=jn837203 1
 
1.4%
https://www.ncbi.nlm.nih.gov/gquery?term=kc986936 1
 
1.4%
https://www.ncbi.nlm.nih.gov/gquery?term=eu527225;https://www.ncbi.nlm.nih.gov/gquery?term=eu527322;https://www.ncbi.nlm.nih.gov/gquery?term=eu527275 1
 
1.4%
Other values (63) 63
85.1%
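
The associatedSequences cells shown above hold one or more query URLs joined by semicolons, each carrying an accession identifier in its `term` parameter. As a rough sketch, assuming that layout holds for all 74 populated rows, the identifiers could be pulled out with the standard library as follows. The function name `accession_ids` is hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def accession_ids(cell):
    """Split a semicolon-delimited cell and return the `term` value of each URL."""
    ids = []
    for url in cell.split(";"):
        url = url.strip()
        if not url:
            continue  # tolerate trailing or doubled semicolons
        # parse_qs maps each query key to a list of values.
        ids.extend(parse_qs(urlparse(url).query).get("term", []))
    return ids


# Fifth sample row of associatedSequences, copied from the report.
cell = (
    "https://www.ncbi.nlm.nih.gov/gquery?term=EU527211;"
    "https://www.ncbi.nlm.nih.gov/gquery?term=EU527308;"
    "https://www.ncbi.nlm.nih.gov/gquery?term=EU527261"
)
print(accession_ids(cell))
```

Splitting on the semicolon before parsing keeps each URL intact. Identifiers come back in the case stored in the cell, whereas the value table above lists them lowercased.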

Most occurring characters

ValueCountFrequency (%)
. 876
 
8.1%
t 657
 
6.0%
/ 657
 
6.0%
w 657
 
6.0%
n 657
 
6.0%
h 438
 
4.0%
i 438
 
4.0%
r 438
 
4.0%
e 438
 
4.0%
g 438
 
4.0%
Other values (38) 5182
47.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6789
62.4%
Other Punctuation 2116
 
19.5%
Decimal Number 1314
 
12.1%
Uppercase Letter 438
 
4.0%
Math Symbol 219
 
2.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 657
 
9.7%
w 657
 
9.7%
n 657
 
9.7%
h 438
 
6.5%
i 438
 
6.5%
r 438
 
6.5%
e 438
 
6.5%
g 438
 
6.5%
m 438
 
6.5%
l 219
 
3.2%
Other values (9) 1971
29.0%
Uppercase Letter
ValueCountFrequency (%)
K 99
22.6%
F 97
22.1%
N 66
15.1%
J 57
13.0%
E 24
 
5.5%
U 24
 
5.5%
M 21
 
4.8%
T 17
 
3.9%
Y 10
 
2.3%
A 9
 
2.1%
Other values (3) 14
 
3.2%
Decimal Number
ValueCountFrequency (%)
9 271
20.6%
8 198
15.1%
3 159
12.1%
7 159
12.1%
5 139
10.6%
2 121
9.2%
0 75
 
5.7%
1 74
 
5.6%
6 65
 
4.9%
4 53
 
4.0%
Other Punctuation
ValueCountFrequency (%)
. 876
41.4%
/ 657
31.0%
: 219
 
10.3%
? 219
 
10.3%
; 145
 
6.9%
Math Symbol
ValueCountFrequency (%)
= 219
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7227
66.4%
Common 3649
33.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 657
 
9.1%
w 657
 
9.1%
n 657
 
9.1%
h 438
 
6.1%
i 438
 
6.1%
r 438
 
6.1%
e 438
 
6.1%
g 438
 
6.1%
m 438
 
6.1%
l 219
 
3.0%
Other values (22) 2409
33.3%
Common
ValueCountFrequency (%)
. 876
24.0%
/ 657
18.0%
9 271
 
7.4%
: 219
 
6.0%
= 219
 
6.0%
? 219
 
6.0%
8 198
 
5.4%
3 159
 
4.4%
7 159
 
4.4%
; 145
 
4.0%
Other values (6) 527
14.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10876
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 876
 
8.1%
t 657
 
6.0%
/ 657
 
6.0%
w 657
 
6.0%
n 657
 
6.0%
h 438
 
4.0%
i 438
 
4.0%
r 438
 
4.0%
e 438
 
4.0%
g 438
 
4.0%
Other values (38) 5182
47.6%

occurrenceRemarks
Text

Missing 

Distinct: 7579
Distinct (%): 37.9%
Missing: 968411
Missing (%): 98.0%
Memory size: 7.5 MiB

Length

Max length: 4263
Median length: 2214
Mean length: 78.05767595
Min length: 1

Characters and Unicode

Total characters: 1560451
Distinct characters: 123
Distinct categories: 15
Distinct scripts: 3
Distinct blocks: 5

Unique

Unique: 6837
Unique (%): 34.2%

Sample

1st row: Received as: seed
2nd row: Transcribed by digital volunteers
3rd row: BRG
4th row: Transcribed by digital volunteers; Original spelling as annotated and published is "subplebeia". Same (?) taxon re-published in Contr. U.S. Natl. Herb. 17: 46 (1913) with more explicit type citation. Unclear whether Lecidea subplebeia is a later homonym of Lecidea subplebeja Vain. (1890); Lecidea austrocalifornica Zahlbr. published as replacement name but citing Lecidea "subplebeja Nyl. apud Hasse". The latter name is superfluous if the original name is not a later homonym.
5th row: US, NY
ValueCountFrequency (%)
by 8401
 
3.7%
transcribed 6608
 
2.9%
digital 6534
 
2.9%
volunteers 6533
 
2.9%
4336
 
1.9%
of 3855
 
1.7%
us 3164
 
1.4%
as 3111
 
1.4%
and 2908
 
1.3%
the 2877
 
1.3%
Other values (18932) 177871
78.6%

Most occurring characters

ValueCountFrequency (%)
206207
 
13.2%
e 125859
 
8.1%
a 97880
 
6.3%
i 90227
 
5.8%
t 77282
 
5.0%
n 75398
 
4.8%
o 74864
 
4.8%
r 73280
 
4.7%
l 65793
 
4.2%
s 59459
 
3.8%
Other values (113) 614202
39.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1046064
67.0%
Space Separator 206207
 
13.2%
Uppercase Letter 132849
 
8.5%
Other Punctuation 83260
 
5.3%
Decimal Number 69248
 
4.4%
Dash Punctuation 8074
 
0.5%
Open Punctuation 7113
 
0.5%
Close Punctuation 7104
 
0.5%
Connector Punctuation 195
 
< 0.1%
Math Symbol 153
 
< 0.1%
Other values (5) 184
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 125859
12.0%
a 97880
 
9.4%
i 90227
 
8.6%
t 77282
 
7.4%
n 75398
 
7.2%
o 74864
 
7.2%
r 73280
 
7.0%
l 65793
 
6.3%
s 59459
 
5.7%
c 43851
 
4.2%
Other values (32) 262171
25.1%
Uppercase Letter
ValueCountFrequency (%)
S 13820
 
10.4%
T 12273
 
9.2%
C 11400
 
8.6%
A 10232
 
7.7%
B 8549
 
6.4%
P 6979
 
5.3%
F 6440
 
4.8%
R 6027
 
4.5%
H 5879
 
4.4%
M 5726
 
4.3%
Other values (18) 45524
34.3%
Other Punctuation
ValueCountFrequency (%)
. 36042
43.3%
, 24281
29.2%
; 8664
 
10.4%
: 5897
 
7.1%
" 4572
 
5.5%
& 1629
 
2.0%
' 1020
 
1.2%
/ 444
 
0.5%
? 313
 
0.4%
# 227
 
0.3%
Other values (7) 171
 
0.2%
Decimal Number
ValueCountFrequency (%)
1 15281
22.1%
9 9550
13.8%
2 7514
10.9%
0 7154
10.3%
3 5592
 
8.1%
8 5486
 
7.9%
4 4963
 
7.2%
7 4813
 
7.0%
5 4717
 
6.8%
6 4178
 
6.0%
Math Symbol
ValueCountFrequency (%)
= 100
65.4%
+ 29
 
19.0%
× 13
 
8.5%
~ 4
 
2.6%
> 3
 
2.0%
< 3
 
2.0%
| 1
 
0.7%
Dash Punctuation
ValueCountFrequency (%)
- 7974
98.8%
98
 
1.2%
2
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 6683
94.0%
[ 429
 
6.0%
{ 1
 
< 0.1%
Nonspacing Mark
ValueCountFrequency (%)
́ 48
60.0%
̀ 16
 
20.0%
̧ 16
 
20.0%
Other Symbol
ValueCountFrequency (%)
° 21
75.0%
© 5
 
17.9%
2
 
7.1%
Close Punctuation
ValueCountFrequency (%)
) 6679
94.0%
] 425
 
6.0%
Space Separator
ValueCountFrequency (%)
206207
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 195
100.0%
Final Punctuation
ValueCountFrequency (%)
38
100.0%
Initial Punctuation
ValueCountFrequency (%)
35
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1178913
75.5%
Common 381458
 
24.4%
Inherited 80
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 125859
 
10.7%
a 97880
 
8.3%
i 90227
 
7.7%
t 77282
 
6.6%
n 75398
 
6.4%
o 74864
 
6.4%
r 73280
 
6.2%
l 65793
 
5.6%
s 59459
 
5.0%
c 43851
 
3.7%
Other values (60) 395020
33.5%
Common
ValueCountFrequency (%)
206207
54.1%
. 36042
 
9.4%
, 24281
 
6.4%
1 15281
 
4.0%
9 9550
 
2.5%
; 8664
 
2.3%
- 7974
 
2.1%
2 7514
 
2.0%
0 7154
 
1.9%
( 6683
 
1.8%
Other values (40) 52108
 
13.7%
Inherited
ValueCountFrequency (%)
́ 48
60.0%
̀ 16
 
20.0%
̧ 16
 
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1559562
99.9%
None 622
 
< 0.1%
Punctuation 185
 
< 0.1%
Diacriticals 80
 
< 0.1%
Misc Symbols 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
206207
 
13.2%
e 125859
 
8.1%
a 97880
 
6.3%
i 90227
 
5.8%
t 77282
 
5.0%
n 75398
 
4.8%
o 74864
 
4.8%
r 73280
 
4.7%
l 65793
 
4.2%
s 59459
 
3.8%
Other values (80) 613313
39.3%
None
ValueCountFrequency (%)
í 184
29.6%
é 133
21.4%
ñ 79
12.7%
á 53
 
8.5%
è 26
 
4.2%
ç 25
 
4.0%
ó 22
 
3.5%
° 21
 
3.4%
ü 16
 
2.6%
× 13
 
2.1%
Other values (13) 50
 
8.0%
Punctuation
ValueCountFrequency (%)
98
53.0%
38
 
20.5%
35
 
18.9%
8
 
4.3%
4
 
2.2%
2
 
1.1%
Diacriticals
ValueCountFrequency (%)
́ 48
60.0%
̀ 16
 
20.0%
̧ 16
 
20.0%
Misc Symbols
ValueCountFrequency (%)
2
100.0%

fieldNumber
Text

Missing 

Distinct5
Distinct (%)8.5%
Missing988343
Missing (%)> 99.9%
Memory size7.5 MiB
2025-01-08T17:49:53.190343image/svg+xmlMatplotlib v3.9.4, https://matplotlib.org/

Length

Max length19
Median length9
Mean length9.322033898
Min length9

Characters and Unicode

Total characters550
Distinct characters32
Distinct categories6
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4
Unique (%)6.8%

Sample

1st rowSample OY
2nd rowSample OY
3rd rowSample OY
4th rowSample OY
5th rowSample OY
ValueCountFrequency (%)
sample 55
45.8%
oy 55
45.8%
a 2
 
1.7%
u.s 1
 
0.8%
virgin 1
 
0.8%
islands 1
 
0.8%
alakai_220 1
 
0.8%
koolau_784 1
 
0.8%
koolau 1
 
0.8%
850 1
 
0.8%

Most occurring characters

ValueCountFrequency (%)
a 62
11.3%
61
11.1%
l 59
10.7%
S 56
10.2%
m 55
10.0%
p 55
10.0%
e 55
10.0%
O 55
10.0%
Y 55
10.0%
o 4
 
0.7%
Other values (22) 33
6.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 304
55.3%
Uppercase Letter 172
31.3%
Space Separator 61
 
11.1%
Decimal Number 9
 
1.6%
Connector Punctuation 2
 
0.4%
Other Punctuation 2
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 62
20.4%
l 59
19.4%
m 55
18.1%
p 55
18.1%
e 55
18.1%
o 4
 
1.3%
i 3
 
1.0%
u 2
 
0.7%
s 2
 
0.7%
n 2
 
0.7%
Other values (5) 5
 
1.6%
Uppercase Letter
ValueCountFrequency (%)
S 56
32.6%
O 55
32.0%
Y 55
32.0%
K 2
 
1.2%
I 1
 
0.6%
A 1
 
0.6%
V 1
 
0.6%
U 1
 
0.6%
Decimal Number
ValueCountFrequency (%)
2 2
22.2%
8 2
22.2%
0 2
22.2%
7 1
11.1%
5 1
11.1%
4 1
11.1%
Space Separator
ValueCountFrequency (%)
61
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 476
86.5%
Common 74
 
13.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 62
13.0%
l 59
12.4%
S 56
11.8%
m 55
11.6%
p 55
11.6%
e 55
11.6%
O 55
11.6%
Y 55
11.6%
o 4
 
0.8%
i 3
 
0.6%
Other values (13) 17
 
3.6%
Common
ValueCountFrequency (%)
61
82.4%
_ 2
 
2.7%
2 2
 
2.7%
8 2
 
2.7%
0 2
 
2.7%
. 2
 
2.7%
7 1
 
1.4%
5 1
 
1.4%
4 1
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 550
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 62
11.3%
61
11.1%
l 59
10.7%
S 56
10.2%
m 55
10.0%
p 55
10.0%
e 55
10.0%
O 55
10.0%
Y 55
10.0%
o 4
 
0.7%
Other values (22) 33
6.0%
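
The category tallies in the "Most occurring categories" tables can be approximated with Python's standard `unicodedata` module. This is a minimal sketch, not the profiler's own implementation: the function name `category_counts` and the `LONG_NAMES` mapping are illustrative, and the two input strings are taken from the fieldNumber sample and word list above.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in the fieldNumber tables.
LONG_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Pc": "Connector Punctuation",
    "Po": "Other Punctuation",
}

def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

counts = category_counts(["Sample OY", "alakai_220"])
for code, n in counts.most_common():
    print(LONG_NAMES.get(code, code), n)
```

Running the same tally over every non-missing value of a column should give counts comparable to the table above, although the profiler may rely on a different Unicode database version.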

eventDate
Text

Missing 

Distinct66958
Distinct (%)7.7%
Missing119809
Missing (%)12.1%
Memory size7.5 MiB

Length

Max length21
Median length10
Mean length10.01351151
Min length4

Characters and Unicode

Total characters8697666
Distinct characters12
Distinct categories3
Distinct scripts1
Distinct blocks1

Unique

Unique11342
Unique (%)1.3%

Sample

1st row1981-04-30
2nd row1954-08-07
3rd row1947-04-03
4th row1966-04-01
5th row1971-03-23
ValueCountFrequency (%)
1891 1085
 
0.1%
1923 918
 
0.1%
1922 844
 
0.1%
1889 814
 
0.1%
1885 814
 
0.1%
1892 772
 
0.1%
1890 762
 
0.1%
1897 759
 
0.1%
1880 756
 
0.1%
1875 745
 
0.1%
Other values (66948) 860324
99.0%

Most occurring characters

ValueCountFrequency (%)
- 1652713
19.0%
1 1649641
19.0%
0 1325270
15.2%
9 1106125
12.7%
2 658306
 
7.6%
8 502398
 
5.8%
7 370750
 
4.3%
6 370722
 
4.3%
3 354814
 
4.1%
5 329527
 
3.8%
Other values (2) 377400
 
4.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6991922
80.4%
Dash Punctuation 1652713
 
19.0%
Other Punctuation 53031
 
0.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1649641
23.6%
0 1325270
19.0%
9 1106125
15.8%
2 658306
 
9.4%
8 502398
 
7.2%
7 370750
 
5.3%
6 370722
 
5.3%
3 354814
 
5.1%
5 329527
 
4.7%
4 324369
 
4.6%
Dash Punctuation
ValueCountFrequency (%)
- 1652713
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 53031
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8697666
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 1652713
19.0%
1 1649641
19.0%
0 1325270
15.2%
9 1106125
12.7%
2 658306
 
7.6%
8 502398
 
5.8%
7 370750
 
4.3%
6 370722
 
4.3%
3 354814
 
4.1%
5 329527
 
3.8%
Other values (2) 377400
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8697666
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 1652713
19.0%
1 1649641
19.0%
0 1325270
15.2%
9 1106125
12.7%
2 658306
 
7.6%
8 502398
 
5.8%
7 370750
 
4.3%
6 370722
 
4.3%
3 354814
 
4.1%
5 329527
 
3.8%
Other values (2) 377400
 
4.3%
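
The percentages in the eventDate summary are consistent with the following arithmetic: Missing (%) is taken over all observations, while Distinct (%), Unique (%) and Mean length are taken over the non-missing values only. The sketch below checks this against the figures reported above; the function name `summary_percentages` is illustrative, and the choice of denominators is inferred from the numbers rather than from the profiler's documentation.

```python
def summary_percentages(n_rows, n_missing, n_distinct, n_unique, total_chars):
    """Recompute the summary ratios from the raw counts in the report."""
    n_present = n_rows - n_missing  # non-missing values
    return {
        "missing_pct": 100 * n_missing / n_rows,
        "distinct_pct": 100 * n_distinct / n_present,
        "unique_pct": 100 * n_unique / n_present,
        "mean_length": total_chars / n_present,
    }

# Counts copied from the eventDate section and the dataset overview.
stats = summary_percentages(
    n_rows=988402,
    n_missing=119809,
    n_distinct=66958,
    n_unique=11342,
    total_chars=8697666,
)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

The same denominators reproduce the fieldNumber figures: 5 distinct values over 59 non-missing rows gives 8.5%.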

startDayOfYear
Text

Missing 

Distinct366
Distinct (%)0.1%
Missing261666
Missing (%)26.5%
Memory size7.5 MiB

Length

Max length3
Median length3
Mean length2.775570496
Min length1

Characters and Unicode

Total characters2017107
Distinct characters10
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st row120
2nd row219
3rd row93
4th row91
5th row82
ValueCountFrequency (%)
201 3860
 
0.5%
200 3710
 
0.5%
196 3699
 
0.5%
210 3653
 
0.5%
199 3644
 
0.5%
206 3635
 
0.5%
209 3596
 
0.5%
208 3571
 
0.5%
197 3518
 
0.5%
205 3509
 
0.5%
Other values (356) 690341
95.0%

Most occurring characters

ValueCountFrequency (%)
2 402055
19.9%
1 395875
19.6%
3 232241
11.5%
5 148144
 
7.3%
4 147792
 
7.3%
0 141087
 
7.0%
6 140313
 
7.0%
9 139529
 
6.9%
8 135484
 
6.7%
7 134587
 
6.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2017107
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 402055
19.9%
1 395875
19.6%
3 232241
11.5%
5 148144
 
7.3%
4 147792
 
7.3%
0 141087
 
7.0%
6 140313
 
7.0%
9 139529
 
6.9%
8 135484
 
6.7%
7 134587
 
6.7%

Most occurring scripts

ValueCountFrequency (%)
Common 2017107
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 402055
19.9%
1 395875
19.6%
3 232241
11.5%
5 148144
 
7.3%
4 147792
 
7.3%
0 141087
 
7.0%
6 140313
 
7.0%
9 139529
 
6.9%
8 135484
 
6.7%
7 134587
 
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2017107
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 402055
19.9%
1 395875
19.6%
3 232241
11.5%
5 148144
 
7.3%
4 147792
 
7.3%
0 141087
 
7.0%
6 140313
 
7.0%
9 139529
 
6.9%
8 135484
 
6.7%
7 134587
 
6.7%
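
The startDayOfYear sample values agree with the ordinal day of the eventDate sample values, assuming the five sample rows of both variables refer to the same records. A minimal sketch of that derivation, using only the standard library (the helper name `day_of_year` is illustrative):

```python
from datetime import date

def day_of_year(iso_date):
    """Return the 1-based ordinal day of the year for a YYYY-MM-DD string."""
    return date.fromisoformat(iso_date).timetuple().tm_yday

# eventDate sample rows paired with the startDayOfYear sample rows.
samples = {
    "1981-04-30": 120,
    "1954-08-07": 219,
    "1947-04-03": 93,
    "1966-04-01": 91,
    "1971-03-23": 82,
}
for iso_date, expected in samples.items():
    print(iso_date, day_of_year(iso_date), expected)
```

This helper only handles full single dates; eventDate values that are year-only or that contain a `/` would need separate handling.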

endDayOfYear
Text

Missing 

Distinct366
Distinct (%)0.1%
Missing261666
Missing (%)26.5%
Memory size7.5 MiB

Length

Max length3
Median length3
Mean length2.776592876
Min length1

Characters and Unicode

Total characters2017850
Distinct characters10
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st row120
2nd row219
3rd row93
4th row91
5th row82
ValueCountFrequency (%)
201 3878
 
0.5%
200 3781
 
0.5%
210 3758
 
0.5%
199 3668
 
0.5%
206 3643
 
0.5%
196 3642
 
0.5%
209 3624
 
0.5%
208 3616
 
0.5%
197 3589
 
0.5%
205 3564
 
0.5%
Other values (356) 689973
94.9%

Most occurring characters

ValueCountFrequency (%)
2 403054
20.0%
1 395145
19.6%
3 232842
11.5%
5 148228
 
7.3%
4 147942
 
7.3%
0 141306
 
7.0%
6 139307
 
6.9%
9 138440
 
6.9%
8 136018
 
6.7%
7 135568
 
6.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2017850
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 403054
20.0%
1 395145
19.6%
3 232842
11.5%
5 148228
 
7.3%
4 147942
 
7.3%
0 141306
 
7.0%
6 139307
 
6.9%
9 138440
 
6.9%
8 136018
 
6.7%
7 135568
 
6.7%

Most occurring scripts

ValueCountFrequency (%)
Common 2017850
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 403054
20.0%
1 395145
19.6%
3 232842
11.5%
5 148228
 
7.3%
4 147942
 
7.3%
0 141306
 
7.0%
6 139307
 
6.9%
9 138440
 
6.9%
8 136018
 
6.7%
7 135568
 
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2017850
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 403054
20.0%
1 395145
19.6%
3 232842
11.5%
5 148228
 
7.3%
4 147942
 
7.3%
0 141306
 
7.0%
6 139307
 
6.9%
9 138440
 
6.9%
8 136018
 
6.7%
7 135568
 
6.7%

year
Text

Missing 

Distinct250
Distinct (%)< 0.1%
Missing122319
Missing (%)12.4%
Memory size7.5 MiB

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters3464332
Distinct characters10
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique13
Unique (%)< 0.1%

Sample

1st row1981
2nd row1954
3rd row1947
4th row1966
5th row1971
ValueCountFrequency (%)
1966 11485
 
1.3%
1964 11177
 
1.3%
1939 10631
 
1.2%
1929 9967
 
1.2%
1949 9934
 
1.1%
1938 9757
 
1.1%
1965 9721
 
1.1%
1962 9422
 
1.1%
1922 9238
 
1.1%
1968 9163
 
1.1%
Other values (240) 765588
88.4%

Most occurring characters

ValueCountFrequency (%)
1 985189
28.4%
9 894393
25.8%
8 298773
 
8.6%
0 225639
 
6.5%
2 208677
 
6.0%
6 191651
 
5.5%
4 170693
 
4.9%
3 166238
 
4.8%
7 162371
 
4.7%
5 160708
 
4.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3464332
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 985189
28.4%
9 894393
25.8%
8 298773
 
8.6%
0 225639
 
6.5%
2 208677
 
6.0%
6 191651
 
5.5%
4 170693
 
4.9%
3 166238
 
4.8%
7 162371
 
4.7%
5 160708
 
4.6%

Most occurring scripts

ValueCountFrequency (%)
Common 3464332
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 985189
28.4%
9 894393
25.8%
8 298773
 
8.6%
0 225639
 
6.5%
2 208677
 
6.0%
6 191651
 
5.5%
4 170693
 
4.9%
3 166238
 
4.8%
7 162371
 
4.7%
5 160708
 
4.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3464332
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 985189
28.4%
9 894393
25.8%
8 298773
 
8.6%
0 225639
 
6.5%
2 208677
 
6.0%
6 191651
 
5.5%
4 170693
 
4.9%
3 166238
 
4.8%
7 162371
 
4.7%
5 160708
 
4.6%

month
Text

Missing 

Distinct12
Distinct (%)< 0.1%
Missing181983
Missing (%)18.4%
Memory size7.5 MiB

Length

Max length2
Median length1
Mean length1.17061354
Min length1

Characters and Unicode

Total characters944005
Distinct characters10
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st row4
2nd row8
3rd row4
4th row4
5th row3
ValueCountFrequency (%)
7 116222
14.4%
8 105467
13.1%
6 87311
10.8%
5 73413
9.1%
9 72707
9.0%
4 61870
7.7%
3 56406
7.0%
10 54634
6.8%
2 49486
6.1%
1 45951
 
5.7%
Other values (2) 82952
10.3%

Most occurring characters

ValueCountFrequency (%)
1 227383
24.1%
7 116222
12.3%
8 105467
11.2%
2 88592
 
9.4%
6 87311
 
9.2%
5 73413
 
7.8%
9 72707
 
7.7%
4 61870
 
6.6%
3 56406
 
6.0%
0 54634
 
5.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 944005
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 227383
24.1%
7 116222
12.3%
8 105467
11.2%
2 88592
 
9.4%
6 87311
 
9.2%
5 73413
 
7.8%
9 72707
 
7.7%
4 61870
 
6.6%
3 56406
 
6.0%
0 54634
 
5.8%

Most occurring scripts

ValueCountFrequency (%)
Common 944005
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 227383
24.1%
7 116222
12.3%
8 105467
11.2%
2 88592
 
9.4%
6 87311
 
9.2%
5 73413
 
7.8%
9 72707
 
7.7%
4 61870
 
6.6%
3 56406
 
6.0%
0 54634
 
5.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 944005
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 227383
24.1%
7 116222
12.3%
8 105467
11.2%
2 88592
 
9.4%
6 87311
 
9.2%
5 73413
 
7.8%
9 72707
 
7.7%
4 61870
 
6.6%
3 56406
 
6.0%
0 54634
 
5.8%

day
Text

Missing 

Distinct31
Distinct (%)< 0.1%
Missing314697
Missing (%)31.8%
Memory size7.5 MiB

Length

Max length2
Median length2
Mean length1.713809457
Min length1

Characters and Unicode

Total characters1154602
Distinct characters10
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st row30
2nd row7
3rd row3
4th row1
5th row23
ValueCountFrequency (%)
20 24963
 
3.7%
15 24514
 
3.6%
18 23599
 
3.5%
10 23434
 
3.5%
19 22891
 
3.4%
25 22886
 
3.4%
17 22629
 
3.4%
23 22542
 
3.3%
24 22331
 
3.3%
21 22292
 
3.3%
Other values (21) 441624
65.6%

Most occurring characters

ValueCountFrequency (%)
1 303094
26.3%
2 288122
25.0%
3 97141
 
8.4%
5 69071
 
6.0%
0 68565
 
5.9%
8 67525
 
5.8%
7 66156
 
5.7%
4 65473
 
5.7%
6 65354
 
5.7%
9 64101
 
5.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1154602
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 303094
26.3%
2 288122
25.0%
3 97141
 
8.4%
5 69071
 
6.0%
0 68565
 
5.9%
8 67525
 
5.8%
7 66156
 
5.7%
4 65473
 
5.7%
6 65354
 
5.7%
9 64101
 
5.6%

Most occurring scripts

ValueCountFrequency (%)
Common 1154602
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 303094
26.3%
2 288122
25.0%
3 97141
 
8.4%
5 69071
 
6.0%
0 68565
 
5.9%
8 67525
 
5.8%
7 66156
 
5.7%
4 65473
 
5.7%
6 65354
 
5.7%
9 64101
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1154602
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 303094
26.3%
2 288122
25.0%
3 97141
 
8.4%
5 69071
 
6.0%
0 68565
 
5.9%
8 67525
 
5.8%
7 66156
 
5.7%
4 65473
 
5.7%
6 65354
 
5.7%
9 64101
 
5.6%

verbatimEventDate
Text

Missing 

Distinct83121
Distinct (%)25.0%
Missing655426
Missing (%)66.3%
Memory size7.5 MiB

Length

Max length69610
Median length11
Mean length13.58990137
Min length1

Characters and Unicode

Total characters4525111
Distinct characters101
Distinct categories14
Distinct scripts2
Distinct blocks3

Unique

Unique34904
Unique (%)10.5%

Sample

1st row30 Apr 1981
2nd row16 Dec 1953
3rd row-- --- ----
4th row01 Feb 1974
5th rowTranscribed d/m/y: 28/4/76
ValueCountFrequency (%)
124747
 
12.0%
transcribed 35815
 
3.5%
d/m/y 35815
 
3.5%
jul 29191
 
2.8%
aug 27927
 
2.7%
may 22223
 
2.1%
sep 22121
 
2.1%
jun 22087
 
2.1%
to 19593
 
1.9%
apr 19397
 
1.9%
Other values (27964) 676941
65.4%

Most occurring characters

ValueCountFrequency (%)
698931
 
15.4%
1 452053
 
10.0%
- 374626
 
8.3%
9 327456
 
7.2%
2 199891
 
4.4%
0 167707
 
3.7%
8 147074
 
3.3%
/ 146437
 
3.2%
r 129266
 
2.9%
e 109856
 
2.4%
Other values (91) 1771814
39.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1788765
39.5%
Lowercase Letter 1136371
25.1%
Space Separator 698931
 
15.4%
Dash Punctuation 374626
 
8.3%
Uppercase Letter 323132
 
7.1%
Other Punctuation 189357
 
4.2%
Control 12953
 
0.3%
Connector Punctuation 644
 
< 0.1%
Open Punctuation 162
 
< 0.1%
Close Punctuation 162
 
< 0.1%
Other values (4) 8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 129266
11.4%
e 109856
 
9.7%
a 108149
 
9.5%
u 93732
 
8.2%
n 82604
 
7.3%
d 73958
 
6.5%
c 72996
 
6.4%
y 64680
 
5.7%
b 61967
 
5.5%
p 49005
 
4.3%
Other values (27) 290158
25.5%
Uppercase Letter
ValueCountFrequency (%)
J 76067
23.5%
A 54199
16.8%
M 46856
14.5%
T 36885
11.4%
S 26921
 
8.3%
F 21380
 
6.6%
O 20504
 
6.3%
N 17433
 
5.4%
D 14938
 
4.6%
E 1440
 
0.4%
Other values (19) 6509
 
2.0%
Other Punctuation
ValueCountFrequency (%)
/ 146437
77.3%
: 36756
 
19.4%
, 2937
 
1.6%
. 2690
 
1.4%
' 164
 
0.1%
? 142
 
0.1%
! 112
 
0.1%
; 57
 
< 0.1%
& 31
 
< 0.1%
" 16
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 452053
25.3%
9 327456
18.3%
2 199891
11.2%
0 167707
 
9.4%
8 147074
 
8.2%
6 107431
 
6.0%
3 104037
 
5.8%
4 95148
 
5.3%
7 94134
 
5.3%
5 93834
 
5.2%
Control
ValueCountFrequency (%)
12895
99.6%
58
 
0.4%
Open Punctuation
ValueCountFrequency (%)
( 102
63.0%
[ 60
37.0%
Close Punctuation
ValueCountFrequency (%)
) 102
63.0%
] 60
37.0%
Math Symbol
ValueCountFrequency (%)
= 3
60.0%
× 2
40.0%
Space Separator
ValueCountFrequency (%)
698931
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 374626
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 644
100.0%
Other Symbol
ValueCountFrequency (%)
° 1
100.0%
Other Number
ValueCountFrequency (%)
½ 1
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3065608
67.7%
Latin 1459503
32.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 129266
 
8.9%
e 109856
 
7.5%
a 108149
 
7.4%
u 93732
 
6.4%
n 82604
 
5.7%
J 76067
 
5.2%
d 73958
 
5.1%
c 72996
 
5.0%
y 64680
 
4.4%
b 61967
 
4.2%
Other values (56) 586228
40.2%
Common
ValueCountFrequency (%)
698931
22.8%
1 452053
14.7%
- 374626
12.2%
9 327456
10.7%
2 199891
 
6.5%
0 167707
 
5.5%
8 147074
 
4.8%
/ 146437
 
4.8%
6 107431
 
3.5%
3 104037
 
3.4%
Other values (25) 339965
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4525057
> 99.9%
None 53
 
< 0.1%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
698931
 
15.4%
1 452053
 
10.0%
- 374626
 
8.3%
9 327456
 
7.2%
2 199891
 
4.4%
0 167707
 
3.7%
8 147074
 
3.3%
/ 146437
 
3.2%
r 129266
 
2.9%
e 109856
 
2.4%
Other values (73) 1771760
39.2%
None
ValueCountFrequency (%)
é 15
28.3%
í 6
 
11.3%
ó 6
 
11.3%
á 5
 
9.4%
ô 3
 
5.7%
û 3
 
5.7%
Æ 3
 
5.7%
ü 2
 
3.8%
× 2
 
3.8%
° 1
 
1.9%
Other values (7) 7
13.2%
Punctuation
ValueCountFrequency (%)
1
100.0%
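
The verbatimEventDate sample shows at least two recurring layouts, `30 Apr 1981` and `Transcribed d/m/y: 28/4/76`, alongside placeholders such as `-- --- ----`. The sketch below shows one way such strings might be split into day, month and year parts. The patterns and the name `parse_verbatim` are assumptions for illustration and cover only the layouts visible in the sample; the year is returned as text so that a two-digit year is not silently assigned a century.

```python
import re

# English three-letter month abbreviations mapped to month numbers.
MONTHS = {
    name: number
    for number, name in enumerate(
        ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
        start=1,
    )
}

DAY_MONTH_YEAR = re.compile(r"(\d{1,2}) ([A-Za-z]{3}) (\d{4})")
TRANSCRIBED = re.compile(r"Transcribed d/m/y:\s*(\d{1,2})/(\d{1,2})/(\d{2,4})")

def parse_verbatim(text):
    """Return (day, month, year_text) or None if the layout is not recognised."""
    text = text.strip()
    match = DAY_MONTH_YEAR.fullmatch(text)
    if match and match.group(2).title() in MONTHS:
        return int(match.group(1)), MONTHS[match.group(2).title()], match.group(3)
    match = TRANSCRIBED.fullmatch(text)
    if match:
        return int(match.group(1)), int(match.group(2)), match.group(3)
    return None  # placeholders and free text are left unparsed

for raw in ["30 Apr 1981", "Transcribed d/m/y: 28/4/76", "-- --- ----"]:
    print(repr(raw), parse_verbatim(raw))
```

With 83121 distinct values and a maximum length of 69610 characters, many entries will fall outside these two patterns and should be reviewed rather than coerced.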

habitat
Text

Missing 

Distinct54569
Distinct (%)49.4%
Missing877971
Missing (%)88.8%
Memory size7.5 MiB

Length

Max length567
Median length292
Mean length33.5132979
Min length1

Characters and Unicode

Total characters3700907
Distinct characters129
Distinct categories16
Distinct scripts2
Distinct blocks3

Unique

Unique43682
Unique (%)39.6%

Sample

1st rowErect.
2nd rowPlanted
3rd rowHillsides covered with broad-leaved forest, understory with Arthrostylidium, Rubus, and numerous ferns, epiphytes and Usnea.
4th rowOpen to closed forest with Pinus contorta, Populus tremuloides, Purshia tridentata, and Ribes cereum.
5th rowDeep secondary forest; clay soil
ValueCountFrequency (%)
forest 28539
 
5.0%
on 19755
 
3.5%
and 16182
 
2.8%
in 14715
 
2.6%
with 11801
 
2.1%
of 10905
 
1.9%
along 6428
 
1.1%
de 6077
 
1.1%
soil 5416
 
1.0%
sand 4830
 
0.8%
Other values (19599) 444497
78.1%

Most occurring characters

ValueCountFrequency (%)
458714
12.4%
e 334644
 
9.0%
a 293407
 
7.9%
o 267034
 
7.2%
r 232740
 
6.3%
s 232487
 
6.3%
n 229134
 
6.2%
i 195932
 
5.3%
t 185312
 
5.0%
l 144970
 
3.9%
Other values (119) 1126533
30.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2956785
79.9%
Space Separator 458714
 
12.4%
Uppercase Letter 161024
 
4.4%
Other Punctuation 101816
 
2.8%
Decimal Number 9235
 
0.2%
Dash Punctuation 8003
 
0.2%
Close Punctuation 2295
 
0.1%
Open Punctuation 2276
 
0.1%
Math Symbol 699
 
< 0.1%
Other Symbol 30
 
< 0.1%
Other values (6) 30
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 334644
11.3%
a 293407
9.9%
o 267034
 
9.0%
r 232740
 
7.9%
s 232487
 
7.9%
n 229134
 
7.7%
i 195932
 
6.6%
t 185312
 
6.3%
l 144970
 
4.9%
d 142531
 
4.8%
Other values (41) 698594
23.6%
Uppercase Letter
ValueCountFrequency (%)
S 19589
12.2%
M 14835
 
9.2%
C 11216
 
7.0%
P 10463
 
6.5%
O 10200
 
6.3%
A 9926
 
6.2%
R 9817
 
6.1%
D 9128
 
5.7%
B 8873
 
5.5%
F 8509
 
5.3%
Other values (19) 48468
30.1%
Other Punctuation
ValueCountFrequency (%)
, 45008
44.2%
. 43899
43.1%
; 6228
 
6.1%
& 2296
 
2.3%
: 1271
 
1.2%
/ 1232
 
1.2%
" 931
 
0.9%
' 519
 
0.5%
% 153
 
0.2%
? 145
 
0.1%
Other values (6) 134
 
0.1%
Decimal Number
ValueCountFrequency (%)
0 2807
30.4%
3 1186
12.8%
1 1155
12.5%
5 1136
12.3%
2 1092
 
11.8%
4 740
 
8.0%
6 387
 
4.2%
8 304
 
3.3%
7 223
 
2.4%
9 205
 
2.2%
Math Symbol
ValueCountFrequency (%)
~ 423
60.5%
| 121
 
17.3%
+ 82
 
11.7%
± 50
 
7.2%
= 14
 
2.0%
> 6
 
0.9%
< 3
 
0.4%
Close Punctuation
ValueCountFrequency (%)
) 2189
95.4%
] 69
 
3.0%
} 37
 
1.6%
Open Punctuation
ValueCountFrequency (%)
( 2176
95.6%
[ 63
 
2.8%
{ 37
 
1.6%
Dash Punctuation
ValueCountFrequency (%)
- 7994
99.9%
9
 
0.1%
Space Separator
ValueCountFrequency (%)
458714
100.0%
Other Symbol
ValueCountFrequency (%)
° 30
100.0%
Final Punctuation
ValueCountFrequency (%)
10
100.0%
Initial Punctuation
ValueCountFrequency (%)
7
100.0%
Other Letter
ValueCountFrequency (%)
º 5
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 5
100.0%
Other Number
ValueCountFrequency (%)
² 2
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3117814
84.2%
Common 583093
 
15.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 334644
 
10.7%
a 293407
 
9.4%
o 267034
 
8.6%
r 232740
 
7.5%
s 232487
 
7.5%
n 229134
 
7.3%
i 195932
 
6.3%
t 185312
 
5.9%
l 144970
 
4.6%
d 142531
 
4.6%
Other values (71) 859623
27.6%
Common
ValueCountFrequency (%)
458714
78.7%
, 45008
 
7.7%
. 43899
 
7.5%
- 7994
 
1.4%
; 6228
 
1.1%
0 2807
 
0.5%
& 2296
 
0.4%
) 2189
 
0.4%
( 2176
 
0.4%
: 1271
 
0.2%
Other values (38) 10511
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3694009
99.8%
None 6846
 
0.2%
Punctuation 52
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
458714
12.4%
e 334644
 
9.1%
a 293407
 
7.9%
o 267034
 
7.2%
r 232740
 
6.3%
s 232487
 
6.3%
n 229134
 
6.2%
i 195932
 
5.3%
t 185312
 
5.0%
l 144970
 
3.9%
Other values (82) 1119635
30.3%
None
ValueCountFrequency (%)
ú 1030
15.0%
ê 1022
14.9%
é 989
14.4%
ó 939
13.7%
í 817
11.9%
á 696
10.2%
ñ 546
8.0%
è 322
 
4.7%
à 132
 
1.9%
ã 56
 
0.8%
Other values (23) 297
 
4.3%
Punctuation
ValueCountFrequency (%)
26
50.0%
10
 
19.2%
9
 
17.3%
7
 
13.5%

locationID
Text

Missing 

Distinct667
Distinct (%)7.4%
Missing979422
Missing (%)99.1%
Memory size7.5 MiB

Length

Max length35
Median length5
Mean length6.010690423
Min length1

Characters and Unicode

Total characters53976
Distinct characters65
Distinct categories9
Distinct scripts2
Distinct blocks1

Unique

Unique202
Unique (%)2.2%

Sample

1st row66-10
2nd row69-11
3rd row64-51
4th row66-14
5th row64-34
ValueCountFrequency (%)
station 1070
 
10.3%
ms04 374
 
3.6%
66-24 305
 
2.9%
61 200
 
1.9%
64-47 131
 
1.3%
64-48 130
 
1.3%
69-14 124
 
1.2%
64-45 98
 
0.9%
66-28 92
 
0.9%
64-06 90
 
0.9%
Other values (654) 7783
74.9%

Most occurring characters

ValueCountFrequency (%)
6 9454
17.5%
- 7849
14.5%
4 4641
 
8.6%
2 4263
 
7.9%
1 3970
 
7.4%
0 3323
 
6.2%
3 2445
 
4.5%
7 2280
 
4.2%
t 2194
 
4.1%
S 1651
 
3.1%
Other values (55) 11906
22.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 34512
63.9%
Dash Punctuation 7849
 
14.5%
Lowercase Letter 6872
 
12.7%
Uppercase Letter 3196
 
5.9%
Space Separator 1417
 
2.6%
Connector Punctuation 69
 
0.1%
Close Punctuation 26
 
< 0.1%
Open Punctuation 26
 
< 0.1%
Other Punctuation 9
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 1651
51.7%
M 410
 
12.8%
A 247
 
7.7%
I 170
 
5.3%
K 150
 
4.7%
N 107
 
3.3%
T 74
 
2.3%
H 62
 
1.9%
O 51
 
1.6%
B 40
 
1.3%
Other values (15) 234
 
7.3%
Lowercase Letter
ValueCountFrequency (%)
t 2194
31.9%
n 1126
16.4%
o 1120
16.3%
i 1107
16.1%
a 1103
16.1%
e 56
 
0.8%
r 27
 
0.4%
l 25
 
0.4%
s 21
 
0.3%
d 20
 
0.3%
Other values (10) 73
 
1.1%
Decimal Number
ValueCountFrequency (%)
6 9454
27.4%
4 4641
13.4%
2 4263
12.4%
1 3970
11.5%
0 3323
 
9.6%
3 2445
 
7.1%
7 2280
 
6.6%
8 1650
 
4.8%
9 1276
 
3.7%
5 1210
 
3.5%
Other Punctuation
ValueCountFrequency (%)
, 6
66.7%
/ 2
 
22.2%
& 1
 
11.1%
Close Punctuation
ValueCountFrequency (%)
) 25
96.2%
] 1
 
3.8%
Open Punctuation
ValueCountFrequency (%)
( 25
96.2%
[ 1
 
3.8%
Dash Punctuation
ValueCountFrequency (%)
- 7849
100.0%
Space Separator
ValueCountFrequency (%)
1417
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 69
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 43908
81.3%
Latin 10068
 
18.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 2194
21.8%
S 1651
16.4%
n 1126
11.2%
o 1120
11.1%
i 1107
11.0%
a 1103
11.0%
M 410
 
4.1%
A 247
 
2.5%
I 170
 
1.7%
K 150
 
1.5%
Other values (35) 790
 
7.8%
Common
ValueCountFrequency (%)
6 9454
21.5%
- 7849
17.9%
4 4641
10.6%
2 4263
9.7%
1 3970
9.0%
0 3323
 
7.6%
3 2445
 
5.6%
7 2280
 
5.2%
8 1650
 
3.8%
1417
 
3.2%
Other values (10) 2616
 
6.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 53976
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 9454
17.5%
- 7849
14.5%
4 4641
 
8.6%
2 4263
 
7.9%
1 3970
 
7.4%
0 3323
 
6.2%
3 2445
 
4.5%
7 2280
 
4.2%
t 2194
 
4.1%
S 1651
 
3.1%
Other values (55) 11906
22.1%
Distinct17498
Distinct (%)1.8%
Missing8448
Missing (%)0.9%
Memory size7.5 MiB

Length

Max length133
Median length113
Mean length40.94448923
Min length5

Characters and Unicode

Total characters40123716
Distinct characters134
Distinct categories11
Distinct scripts3
Distinct blocks5

Unique

Unique6237
Unique (%)0.6%

Sample

1st rowNorth America, United States, Florida
2nd rowSouth America - Neotropics, Peru, Piura
3rd rowSouth America, Argentina, Formosa
4th rowSouth America - Neotropics, Venezuela, Carabobo
5th rowAfrica, South Africa
ValueCountFrequency (%)
america 664608
 
12.5%
north 382460
 
7.2%
365184
 
6.8%
neotropics 351203
 
6.6%
united 295755
 
5.5%
states 293830
 
5.5%
south 254482
 
4.8%
mexico 71903
 
1.3%
asia-tropical 66600
 
1.2%
brazil 65997
 
1.2%
Other values (10459) 2522178
47.3%

Most occurring characters

ValueCountFrequency (%)
4354246
 
10.9%
a 3701367
 
9.2%
i 2990303
 
7.5%
e 2937990
 
7.3%
r 2542856
 
6.3%
t 2505817
 
6.2%
o 2445015
 
6.1%
, 2049394
 
5.1%
n 1579289
 
3.9%
c 1556529
 
3.9%
Other values (124) 13460910
33.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 28096131
70.0%
Uppercase Letter 5068430
 
12.6%
Space Separator 4354246
 
10.9%
Other Punctuation 2092855
 
5.2%
Dash Punctuation 493187
 
1.2%
Open Punctuation 9360
 
< 0.1%
Close Punctuation 9360
 
< 0.1%
Modifier Letter 104
 
< 0.1%
Modifier Symbol 25
 
< 0.1%
Decimal Number 17
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3701367
13.2%
i 2990303
10.6%
e 2937990
10.5%
r 2542856
9.1%
t 2505817
8.9%
o 2445015
8.7%
n 1579289
 
5.6%
c 1556529
 
5.5%
s 1514118
 
5.4%
m 951564
 
3.4%
Other values (58) 5371283
19.1%
Uppercase Letter
ValueCountFrequency (%)
A 996984
19.7%
N 827492
16.3%
S 695744
13.7%
C 398218
 
7.9%
U 330028
 
6.5%
M 216529
 
4.3%
P 200273
 
4.0%
I 183128
 
3.6%
T 183052
 
3.6%
B 152770
 
3.0%
Other values (33) 884212
17.4%
Other Punctuation
ValueCountFrequency (%)
, 2049394
97.9%
. 29432
 
1.4%
' 8898
 
0.4%
/ 4747
 
0.2%
? 346
 
< 0.1%
& 33
 
< 0.1%
; 2
 
< 0.1%
¡ 2
 
< 0.1%
\ 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 8
47.1%
2 7
41.2%
9 1
 
5.9%
6 1
 
5.9%
Open Punctuation
ValueCountFrequency (%)
( 5733
61.3%
[ 3627
38.8%
Close Punctuation
ValueCountFrequency (%)
) 5733
61.3%
] 3627
38.8%
Modifier Letter
ValueCountFrequency (%)
ʻ 91
87.5%
ʼ 13
 
12.5%
Space Separator
ValueCountFrequency (%)
4354246
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 493187
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 25
100.0%
Nonspacing Mark
ValueCountFrequency (%)
́ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 33164561
82.7%
Common 6959154
 
17.3%
Inherited 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3701367
 
11.2%
i 2990303
 
9.0%
e 2937990
 
8.9%
r 2542856
 
7.7%
t 2505817
 
7.6%
o 2445015
 
7.4%
n 1579289
 
4.8%
c 1556529
 
4.7%
s 1514118
 
4.6%
A 996984
 
3.0%
Other values (101) 10394293
31.3%
Common
ValueCountFrequency (%)
4354246
62.6%
, 2049394
29.4%
- 493187
 
7.1%
. 29432
 
0.4%
' 8898
 
0.1%
( 5733
 
0.1%
) 5733
 
0.1%
/ 4747
 
0.1%
[ 3627
 
0.1%
] 3627
 
0.1%
Other values (12) 530
 
< 0.1%
Inherited
ValueCountFrequency (%)
́ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 40018791
99.7%
None 104815
 
0.3%
Modifier Letters 104
 
< 0.1%
Latin Ext Additional 5
 
< 0.1%
Diacriticals 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4354246
 
10.9%
a 3701367
 
9.2%
i 2990303
 
7.5%
e 2937990
 
7.3%
r 2542856
 
6.4%
t 2505817
 
6.3%
o 2445015
 
6.1%
, 2049394
 
5.1%
n 1579289
 
3.9%
c 1556529
 
3.9%
Other values (60) 13355985
33.4%
None
ValueCountFrequency (%)
á 34764
33.2%
í 19904
19.0%
é 17687
16.9%
ó 12612
 
12.0%
ã 6382
 
6.1%
ô 3029
 
2.9%
ç 1904
 
1.8%
ñ 1606
 
1.5%
Î 1479
 
1.4%
ü 1115
 
1.1%
Other values (49) 4333
 
4.1%
Modifier Letters
ValueCountFrequency (%)
ʻ 91
87.5%
ʼ 13
 
12.5%
Latin Ext Additional
ValueCountFrequency (%)
4
80.0%
1
 
20.0%
Diacriticals
ValueCountFrequency (%)
́ 1
100.0%

continent
Text

Missing 

Distinct7
Distinct (%)< 0.1%
Missing32788
Missing (%)3.3%
Memory size7.5 MiB

Length

Max length13
Median length13
Mean length11.06743099
Min length4

Characters and Unicode

Total characters10576192
Distinct characters15
Distinct categories2
Distinct scripts2
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st rowSOUTH_AMERICA
2nd rowSOUTH_AMERICA
3rd rowSOUTH_AMERICA
4th rowAFRICA
5th rowSOUTH_AMERICA
ValueCountFrequency (%)
north_america 482446
50.5%
south_america 235730
24.7%
asia 113249
 
11.9%
europe 50324
 
5.3%
oceania 37414
 
3.9%
africa 35361
 
3.7%
antarctica 1090
 
0.1%

Most occurring characters

ValueCountFrequency (%)
A 1811670
17.1%
R 1287397
12.2%
I 905290
8.6%
E 856238
8.1%
O 805914
7.6%
C 793131
7.5%
T 720356
 
6.8%
H 718176
 
6.8%
_ 718176
 
6.8%
M 718176
 
6.8%
Other values (5) 1241668
11.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 9858016
93.2%
Connector Punctuation 718176
 
6.8%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 1811670
18.4%
R 1287397
13.1%
I 905290
9.2%
E 856238
8.7%
O 805914
8.2%
C 793131
8.0%
T 720356
 
7.3%
H 718176
 
7.3%
M 718176
 
7.3%
N 520950
 
5.3%
Other values (4) 720718
 
7.3%
Connector Punctuation
ValueCountFrequency (%)
_ 718176
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9858016
93.2%
Common 718176
 
6.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 1811670
18.4%
R 1287397
13.1%
I 905290
9.2%
E 856238
8.7%
O 805914
8.2%
C 793131
8.0%
T 720356
 
7.3%
H 718176
 
7.3%
M 718176
 
7.3%
N 520950
 
5.3%
Other values (4) 720718
 
7.3%
Common
ValueCountFrequency (%)
_ 718176
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10576192
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 1811670
17.1%
R 1287397
12.2%
I 905290
8.6%
E 856238
8.1%
O 805914
7.6%
C 793131
7.5%
T 720356
 
6.8%
H 718176
 
6.8%
_ 718176
 
6.8%
M 718176
 
6.8%
Other values (5) 1241668
11.7%

waterBody
Text

Missing 

Distinct: 75
Distinct (%): 1.8%
Missing: 984227
Missing (%): 99.6%
Memory size: 7.5 MiB

Length

Max length: 62
Median length: 61
Mean length: 25.99209581
Min length: 8

Characters and Unicode

Total characters: 108517
Distinct characters: 52
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
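
The summary figures reported for waterBody above are mutually consistent, which the arithmetic sketch below checks. It assumes (this is inferred from the numbers, not stated in the report) that Distinct (%) and Mean length are computed over non-missing values only, while Missing (%) is taken over all 988402 observations given in the report overview.

```python
# Figures copied from the report; n_rows comes from the dataset overview.
n_rows = 988402
missing = 984227
distinct = 75
total_chars = 108517

present = n_rows - missing                # non-missing values
missing_pct = 100 * missing / n_rows      # share of all rows
distinct_pct = 100 * distinct / present   # assumed: share of non-missing rows
mean_length = total_chars / present       # assumed: mean over non-missing rows

print(present, round(missing_pct, 1), round(distinct_pct, 1), round(mean_length, 8))
```

Under these assumptions the recomputed values round to the reported Missing (%), Distinct (%) and Mean length.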

Unique

Unique: 25
Unique (%): 0.6%

Sample

1st row: North Atlantic Ocean, Bay of Fundy
2nd row: North Atlantic Ocean, Caribbean Sea
3rd row: North Atlantic Ocean, Gulf of Maine, Englishman Bay/Mack Cove
4th row: North Atlantic Ocean, Caribbean Sea
5th row: North Pacific Ocean

Value                Count   Frequency (%)
ocean                3353    20.0%
north                3226    19.2%
atlantic             3034    18.1%
sea                  1523    9.1%
caribbean            1284    7.6%
of                   757     4.5%
gulf                 720     4.3%
maine                576     3.4%
bay                  526     3.1%
pacific              275     1.6%
Other values (74)    1519    9.0%

Most occurring characters

ValueCountFrequency (%)
a 12644
11.7%
12618
11.6%
t 9811
 
9.0%
n 8916
 
8.2%
e 7679
 
7.1%
c 7268
 
6.7%
i 5876
 
5.4%
r 4960
 
4.6%
o 4857
 
4.5%
l 4068
 
3.7%
Other values (42) 29820
27.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 77476
71.4%
Uppercase Letter 16118
 
14.9%
Space Separator 12618
 
11.6%
Other Punctuation 2214
 
2.0%
Modifier Letter 91
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 12644
16.3%
t 9811
12.7%
n 8916
11.5%
e 7679
9.9%
c 7268
9.4%
i 5876
7.6%
r 4960
 
6.4%
o 4857
 
6.3%
l 4068
 
5.3%
h 3689
 
4.8%
Other values (16) 7708
9.9%
Uppercase Letter
ValueCountFrequency (%)
O 3400
21.1%
N 3226
20.0%
A 3059
19.0%
S 1750
10.9%
C 1606
10.0%
G 802
 
5.0%
B 681
 
4.2%
M 657
 
4.1%
P 418
 
2.6%
I 123
 
0.8%
Other values (11) 396
 
2.5%
Other Punctuation
ValueCountFrequency (%)
, 2113
95.4%
/ 83
 
3.7%
' 18
 
0.8%
Space Separator
ValueCountFrequency (%)
12618
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 91
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 93594
86.2%
Common 14923
 
13.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 12644
13.5%
t 9811
10.5%
n 8916
 
9.5%
e 7679
 
8.2%
c 7268
 
7.8%
i 5876
 
6.3%
r 4960
 
5.3%
o 4857
 
5.2%
l 4068
 
4.3%
h 3689
 
3.9%
Other values (37) 23826
25.5%
Common
ValueCountFrequency (%)
12618
84.6%
, 2113
 
14.2%
ʻ 91
 
0.6%
/ 83
 
0.6%
' 18
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 108335
99.8%
Modifier Letters 91
 
0.1%
None 91
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 12644
11.7%
12618
11.6%
t 9811
 
9.1%
n 8916
 
8.2%
e 7679
 
7.1%
c 7268
 
6.7%
i 5876
 
5.4%
r 4960
 
4.6%
o 4857
 
4.5%
l 4068
 
3.8%
Other values (40) 29638
27.4%
Modifier Letters
ValueCountFrequency (%)
ʻ 91
100.0%
None
ValueCountFrequency (%)
ā 91
100.0%

islandGroup
Text

Missing 

Distinct: 362
Distinct (%): 1.5%
Missing: 963568
Missing (%): 97.5%
Memory size: 7.5 MiB

Length

Max length: 42
Median length: 39
Mean length: 14.85515825
Min length: 5

Characters and Unicode

Total characters: 368913
Distinct characters: 62
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 85
Unique (%): 0.3%

Sample

1st row: Greater Antilles
2nd row: Greater Antilles
3rd row: Elizabeth Islands
4th row: Channel Islands
5th row: Greater Antilles

Value                 Count   Frequency (%)
greater               7095    12.5%
antilles              7095    12.5%
islands               5085    9.0%
is                    4355    7.7%
group                 3620    6.4%
new                   1627    2.9%
guinea                1329    2.3%
keys                  1172    2.1%
channel               1169    2.1%
florida               1110    2.0%
Other values (325)    23144   40.7%

Most occurring characters

ValueCountFrequency (%)
e 37198
 
10.1%
a 33600
 
9.1%
31967
 
8.7%
s 29203
 
7.9%
l 28135
 
7.6%
r 26374
 
7.1%
n 24868
 
6.7%
t 19411
 
5.3%
i 18406
 
5.0%
G 13341
 
3.6%
Other values (52) 106410
28.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 275978
74.8%
Uppercase Letter 55169
 
15.0%
Space Separator 31967
 
8.7%
Other Punctuation 4509
 
1.2%
Open Punctuation 643
 
0.2%
Close Punctuation 643
 
0.2%
Dash Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 37198
13.5%
a 33600
12.2%
s 29203
10.6%
l 28135
10.2%
r 26374
9.6%
n 24868
9.0%
t 19411
7.0%
i 18406
6.7%
u 11144
 
4.0%
d 10677
 
3.9%
Other values (17) 36962
13.4%
Uppercase Letter
ValueCountFrequency (%)
G 13341
24.2%
I 10292
18.7%
A 8068
14.6%
C 3484
 
6.3%
V 3078
 
5.6%
L 2538
 
4.6%
N 2119
 
3.8%
S 1813
 
3.3%
K 1439
 
2.6%
F 1319
 
2.4%
Other values (15) 7678
13.9%
Other Punctuation
ValueCountFrequency (%)
. 4348
96.4%
' 155
 
3.4%
, 4
 
0.1%
? 2
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 411
63.9%
( 232
36.1%
Close Punctuation
ValueCountFrequency (%)
] 411
63.9%
) 232
36.1%
Space Separator
ValueCountFrequency (%)
31967
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 331147
89.8%
Common 37766
 
10.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 37198
11.2%
a 33600
 
10.1%
s 29203
 
8.8%
l 28135
 
8.5%
r 26374
 
8.0%
n 24868
 
7.5%
t 19411
 
5.9%
i 18406
 
5.6%
G 13341
 
4.0%
u 11144
 
3.4%
Other values (42) 89467
27.0%
Common
ValueCountFrequency (%)
31967
84.6%
. 4348
 
11.5%
[ 411
 
1.1%
] 411
 
1.1%
( 232
 
0.6%
) 232
 
0.6%
' 155
 
0.4%
, 4
 
< 0.1%
- 4
 
< 0.1%
? 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 366915
99.5%
None 1998
 
0.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 37198
 
10.1%
a 33600
 
9.2%
31967
 
8.7%
s 29203
 
8.0%
l 28135
 
7.7%
r 26374
 
7.2%
n 24868
 
6.8%
t 19411
 
5.3%
i 18406
 
5.0%
G 13341
 
3.6%
Other values (50) 104412
28.5%
None
ValueCountFrequency (%)
Î 1085
54.3%
á 913
45.7%

island
Text

Missing 

Distinct: 2614
Distinct (%): 3.2%
Missing: 906001
Missing (%): 91.7%
Memory size: 7.5 MiB

Length

Max length: 48
Median length: 43
Mean length: 9.546267642
Min length: 2

Characters and Unicode

Total characters: 786622
Distinct characters: 76
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
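
The word-frequency table for this variable lists lower-cased tokens such as "kaua'i" and "st". One plausible way to obtain counts of that shape, shown purely as an assumption about the method rather than the tool's documented behaviour, is to split each value on whitespace, lower-case it, and strip surrounding punctuation. The helper name `word_counts` is hypothetical.

```python
import string
from collections import Counter


def word_counts(values):
    """Hypothetical helper: count lower-cased, punctuation-trimmed tokens."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            # strip() only trims the ends, so the apostrophe in "kaua'i" survives.
            token = token.strip(string.punctuation)
            if token:
                counts[token] += 1
    return counts


# The five sample rows listed for this variable.
sample = ["Rota", "Hispaniola", "North Island", "Kaua'i", "Hispaniola Island"]
counts = word_counts(sample)
print(counts.most_common(3))
```

Trimming only the ends of each token would explain why "St." appears as "st" while internal apostrophes are kept.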

Unique

Unique: 943
Unique (%): 1.1%

Sample

1st row: Rota
2nd row: Hispaniola
3rd row: North Island
4th row: Kaua'i
5th row: Hispaniola Island

Value                  Count   Frequency (%)
hispaniola             10778   8.5%
island                 9771    7.7%
cuba                   4961    3.9%
oahu                   3726    2.9%
st                     2657    2.1%
kaua'i                 2655    2.1%
new                    2291    1.8%
jamaica                2258    1.8%
isla                   2167    1.7%
luzon                  2129    1.7%
Other values (2138)    83021   65.7%

Most occurring characters

ValueCountFrequency (%)
a 125254
15.9%
i 61933
 
7.9%
n 52763
 
6.7%
o 47475
 
6.0%
44013
 
5.6%
l 41457
 
5.3%
u 38087
 
4.8%
e 37522
 
4.8%
s 35293
 
4.5%
r 27973
 
3.6%
Other values (66) 274852
34.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 602963
76.7%
Uppercase Letter 123048
 
15.6%
Space Separator 44013
 
5.6%
Other Punctuation 9305
 
1.2%
Open Punctuation 3492
 
0.4%
Close Punctuation 3492
 
0.4%
Dash Punctuation 309
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 125254
20.8%
i 61933
10.3%
n 52763
8.8%
o 47475
 
7.9%
l 41457
 
6.9%
u 38087
 
6.3%
e 37522
 
6.2%
s 35293
 
5.9%
r 27973
 
4.6%
t 23875
 
4.0%
Other values (28) 111331
18.5%
Uppercase Letter
ValueCountFrequency (%)
H 16106
13.1%
I 14801
12.0%
C 13357
 
10.9%
S 9515
 
7.7%
M 7530
 
6.1%
B 5675
 
4.6%
T 5620
 
4.6%
G 5558
 
4.5%
K 5090
 
4.1%
O 5019
 
4.1%
Other values (17) 34777
28.3%
Other Punctuation
ValueCountFrequency (%)
' 4593
49.4%
. 4379
47.1%
, 297
 
3.2%
? 32
 
0.3%
/ 4
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 2963
84.9%
( 529
 
15.1%
Close Punctuation
ValueCountFrequency (%)
] 2963
84.9%
) 529
 
15.1%
Space Separator
ValueCountFrequency (%)
44013
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 309
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 726011
92.3%
Common 60611
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 125254
17.3%
i 61933
 
8.5%
n 52763
 
7.3%
o 47475
 
6.5%
l 41457
 
5.7%
u 38087
 
5.2%
e 37522
 
5.2%
s 35293
 
4.9%
r 27973
 
3.9%
t 23875
 
3.3%
Other values (55) 234379
32.3%
Common
ValueCountFrequency (%)
44013
72.6%
' 4593
 
7.6%
. 4379
 
7.2%
[ 2963
 
4.9%
] 2963
 
4.9%
) 529
 
0.9%
( 529
 
0.9%
- 309
 
0.5%
, 297
 
0.5%
? 32
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 785195
99.8%
None 1427
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 125254
16.0%
i 61933
 
7.9%
n 52763
 
6.7%
o 47475
 
6.0%
44013
 
5.6%
l 41457
 
5.3%
u 38087
 
4.9%
e 37522
 
4.8%
s 35293
 
4.5%
r 27973
 
3.6%
Other values (52) 273425
34.8%
None
ValueCountFrequency (%)
ç 423
29.6%
Î 320
22.4%
é 214
15.0%
ó 162
 
11.4%
á 116
 
8.1%
â 72
 
5.0%
ñ 57
 
4.0%
ã 36
 
2.5%
í 9
 
0.6%
Ö 7
 
0.5%
Other values (4) 11
 
0.8%

countryCode
Text

Missing 

Distinct: 233
Distinct (%): < 0.1%
Missing: 10855
Missing (%): 1.1%
Memory size: 7.5 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1955094
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
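
Every value of this variable is exactly two characters long and falls in a single category, script and block, which matches the shape of ISO 3166-1 alpha-2 codes. A minimal format check under that assumption is sketched below; `looks_like_country_code` is a hypothetical helper, and it tests only the shape of a value, not whether the code is actually assigned.

```python
import re

# Two uppercase ASCII letters, nothing else.
CODE_PATTERN = re.compile(r"[A-Z]{2}")


def looks_like_country_code(value):
    """Hypothetical helper: True if value has the two-uppercase-letter shape."""
    return CODE_PATTERN.fullmatch(value) is not None


# The five sample rows listed for this variable.
sample = ["US", "PE", "AR", "VE", "ZA"]
results = [looks_like_country_code(v) for v in sample]
print(results)
```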

Unique

Unique: 5
Unique (%): < 0.1%

Sample

1st row: US
2nd row: PE
3rd row: AR
4th row: VE
5th row: ZA

Value                 Count    Frequency (%)
us                    291222   29.8%
br                    65995    6.8%
mx                    63561    6.5%
co                    36051    3.7%
ve                    26234    2.7%
pe                    25485    2.6%
ca                    24554    2.5%
cn                    23614    2.4%
ec                    19520    2.0%
ph                    18818    1.9%
Other values (223)    382493   39.1%

Most occurring characters

ValueCountFrequency (%)
U 323622
16.6%
S 314243
16.1%
C 149828
 
7.7%
R 124243
 
6.4%
P 103814
 
5.3%
B 95778
 
4.9%
M 95536
 
4.9%
E 89602
 
4.6%
A 79231
 
4.1%
X 63569
 
3.3%
Other values (16) 515628
26.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1955094
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 323622
16.6%
S 314243
16.1%
C 149828
 
7.7%
R 124243
 
6.4%
P 103814
 
5.3%
B 95778
 
4.9%
M 95536
 
4.9%
E 89602
 
4.6%
A 79231
 
4.1%
X 63569
 
3.3%
Other values (16) 515628
26.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 1955094
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 323622
16.6%
S 314243
16.1%
C 149828
 
7.7%
R 124243
 
6.4%
P 103814
 
5.3%
B 95778
 
4.9%
M 95536
 
4.9%
E 89602
 
4.6%
A 79231
 
4.1%
X 63569
 
3.3%
Other values (16) 515628
26.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1955094
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 323622
16.6%
S 314243
16.1%
C 149828
 
7.7%
R 124243
 
6.4%
P 103814
 
5.3%
B 95778
 
4.9%
M 95536
 
4.9%
E 89602
 
4.6%
A 79231
 
4.1%
X 63569
 
3.3%
Other values (16) 515628
26.4%

stateProvince
Text

Missing 

Distinct: 3164
Distinct (%): 0.4%
Missing: 219376
Missing (%): 22.2%
Memory size: 7.5 MiB

Length

Max length: 52
Median length: 49
Mean length: 9.001383568
Min length: 1

Characters and Unicode

Total characters: 6922298
Distinct characters: 119
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 709
Unique (%): 0.1%

Sample

1st row: Florida
2nd row: Piura
3rd row: Formosa
4th row: Carabobo
5th row: Manabí

Value                  Count    Frequency (%)
california             44326    4.4%
new                    23059    2.3%
florida                19421    1.9%
virginia               15940    1.6%
texas                  15589    1.5%
alaska                 14760    1.5%
amazonas               13297    1.3%
hawaii                 12078    1.2%
arizona                11151    1.1%
san                    11038    1.1%
Other values (2927)    831105   82.1%

Most occurring characters

ValueCountFrequency (%)
a 1075650
15.5%
i 567474
 
8.2%
n 508752
 
7.3%
o 506284
 
7.3%
r 439940
 
6.4%
e 348295
 
5.0%
s 278225
 
4.0%
l 274188
 
4.0%
t 243136
 
3.5%
242738
 
3.5%
Other values (109) 2437616
35.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5615213
81.1%
Uppercase Letter 1010188
 
14.6%
Space Separator 242738
 
3.5%
Dash Punctuation 25671
 
0.4%
Other Punctuation 18649
 
0.3%
Open Punctuation 4913
 
0.1%
Close Punctuation 4913
 
0.1%
Modifier Letter 13
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1075650
19.2%
i 567474
10.1%
n 508752
9.1%
o 506284
9.0%
r 439940
 
7.8%
e 348295
 
6.2%
s 278225
 
5.0%
l 274188
 
4.9%
t 243136
 
4.3%
u 231162
 
4.1%
Other values (56) 1142107
20.3%
Uppercase Letter
ValueCountFrequency (%)
C 156651
15.5%
M 98977
 
9.8%
S 83880
 
8.3%
A 79458
 
7.9%
N 66806
 
6.6%
P 56759
 
5.6%
T 43026
 
4.3%
B 39393
 
3.9%
V 39298
 
3.9%
L 34505
 
3.4%
Other values (30) 311435
30.8%
Other Punctuation
ValueCountFrequency (%)
. 13890
74.5%
/ 2984
 
16.0%
, 834
 
4.5%
' 765
 
4.1%
? 147
 
0.8%
& 29
 
0.2%
Open Punctuation
ValueCountFrequency (%)
( 4701
95.7%
[ 212
 
4.3%
Close Punctuation
ValueCountFrequency (%)
) 4701
95.7%
] 212
 
4.3%
Space Separator
ValueCountFrequency (%)
242738
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 25671
100.0%
Modifier Letter
ValueCountFrequency (%)
ʼ 13
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6625401
95.7%
Common 296897
 
4.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1075650
16.2%
i 567474
 
8.6%
n 508752
 
7.7%
o 506284
 
7.6%
r 439940
 
6.6%
e 348295
 
5.3%
s 278225
 
4.2%
l 274188
 
4.1%
t 243136
 
3.7%
u 231162
 
3.5%
Other values (96) 2152295
32.5%
Common
ValueCountFrequency (%)
242738
81.8%
- 25671
 
8.6%
. 13890
 
4.7%
( 4701
 
1.6%
) 4701
 
1.6%
/ 2984
 
1.0%
, 834
 
0.3%
' 765
 
0.3%
] 212
 
0.1%
[ 212
 
0.1%
Other values (3) 189
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6837424
98.8%
None 84856
 
1.2%
Modifier Letters 13
 
< 0.1%
Latin Ext Additional 5
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1075650
15.7%
i 567474
 
8.3%
n 508752
 
7.4%
o 506284
 
7.4%
r 439940
 
6.4%
e 348295
 
5.1%
s 278225
 
4.1%
l 274188
 
4.0%
t 243136
 
3.6%
242738
 
3.6%
Other values (54) 2352742
34.4%
None
ValueCountFrequency (%)
á 30893
36.4%
í 17685
20.8%
é 13624
16.1%
ó 9696
 
11.4%
ã 4720
 
5.6%
ô 2764
 
3.3%
ñ 1309
 
1.5%
ü 921
 
1.1%
ä 569
 
0.7%
ö 452
 
0.5%
Other values (42) 2223
 
2.6%
Modifier Letters
ValueCountFrequency (%)
ʼ 13
100.0%
Latin Ext Additional
ValueCountFrequency (%)
4
80.0%
1
 
20.0%

county
Text

Missing 

Distinct: 7486
Distinct (%): 4.6%
Missing: 826754
Missing (%): 83.6%
Memory size: 7.5 MiB

Length

Max length: 49
Median length: 44
Mean length: 9.169770118
Min length: 1

Characters and Unicode

Total characters: 1482275
Distinct characters: 103
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2670
Unique (%): 1.7%

Sample

1st row: Parroquia
2nd row: Duval
3rd row: Boulder
4th row: Cantal
5th row: Arlington

Value                  Count    Frequency (%)
county                 12307    5.4%
san                    7180     3.2%
prince                 4211     1.8%
honolulu               4162     1.8%
santa                  3941     1.7%
los                    3095     1.4%
angeles                3053     1.3%
montgomery             3051     1.3%
george's               2971     1.3%
maui                   2856     1.3%
Other values (6200)    181096   79.5%

Most occurring characters

ValueCountFrequency (%)
a 169425
 
11.4%
o 120482
 
8.1%
n 117091
 
7.9%
e 113146
 
7.6%
r 92576
 
6.2%
i 86084
 
5.8%
t 67805
 
4.6%
u 67402
 
4.5%
66275
 
4.5%
l 63833
 
4.3%
Other values (93) 518156
35.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1186552
80.0%
Uppercase Letter 222119
 
15.0%
Space Separator 66275
 
4.5%
Other Punctuation 5600
 
0.4%
Dash Punctuation 1348
 
0.1%
Open Punctuation 169
 
< 0.1%
Close Punctuation 169
 
< 0.1%
Modifier Symbol 25
 
< 0.1%
Decimal Number 17
 
< 0.1%
Nonspacing Mark 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 169425
14.3%
o 120482
10.2%
n 117091
9.9%
e 113146
9.5%
r 92576
 
7.8%
i 86084
 
7.3%
t 67805
 
5.7%
u 67402
 
5.7%
l 63833
 
5.4%
s 53168
 
4.5%
Other values (36) 235540
19.9%
Uppercase Letter
ValueCountFrequency (%)
C 32957
14.8%
S 24240
10.9%
M 23123
 
10.4%
B 15159
 
6.8%
P 14024
 
6.3%
A 13349
 
6.0%
H 11880
 
5.3%
L 11134
 
5.0%
G 8902
 
4.0%
F 7551
 
3.4%
Other values (26) 59800
26.9%
Other Punctuation
ValueCountFrequency (%)
' 3366
60.1%
. 1205
 
21.5%
/ 820
 
14.6%
? 134
 
2.4%
, 66
 
1.2%
& 4
 
0.1%
; 2
 
< 0.1%
¡ 2
 
< 0.1%
\ 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 8
47.1%
2 7
41.2%
9 1
 
5.9%
6 1
 
5.9%
Open Punctuation
ValueCountFrequency (%)
( 156
92.3%
[ 13
 
7.7%
Close Punctuation
ValueCountFrequency (%)
) 156
92.3%
] 13
 
7.7%
Space Separator
ValueCountFrequency (%)
66275
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1348
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 25
100.0%
Nonspacing Mark
ValueCountFrequency (%)
́ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1408671
95.0%
Common 73603
 
5.0%
Inherited 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 169425
 
12.0%
o 120482
 
8.6%
n 117091
 
8.3%
e 113146
 
8.0%
r 92576
 
6.6%
i 86084
 
6.1%
t 67805
 
4.8%
u 67402
 
4.8%
l 63833
 
4.5%
s 53168
 
3.8%
Other values (72) 457659
32.5%
Common
ValueCountFrequency (%)
66275
90.0%
' 3366
 
4.6%
- 1348
 
1.8%
. 1205
 
1.6%
/ 820
 
1.1%
( 156
 
0.2%
) 156
 
0.2%
? 134
 
0.2%
, 66
 
0.1%
´ 25
 
< 0.1%
Other values (10) 52
 
0.1%
Inherited
ValueCountFrequency (%)
́ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1468756
99.1%
None 13518
 
0.9%
Diacriticals 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 169425
 
11.5%
o 120482
 
8.2%
n 117091
 
8.0%
e 113146
 
7.7%
r 92576
 
6.3%
i 86084
 
5.9%
t 67805
 
4.6%
u 67402
 
4.6%
66275
 
4.5%
l 63833
 
4.3%
Other values (60) 504637
34.4%
None
ValueCountFrequency (%)
á 2802
20.7%
é 2321
17.2%
í 2210
16.3%
ó 1905
14.1%
ã 1610
11.9%
ç 989
 
7.3%
è 288
 
2.1%
ô 265
 
2.0%
ñ 240
 
1.8%
ê 235
 
1.7%
Other values (22) 653
 
4.8%
Diacriticals
ValueCountFrequency (%)
́ 1
100.0%

locality
Text

Missing 

Distinct: 617492
Distinct (%): 67.4%
Missing: 72708
Missing (%): 7.4%
Memory size: 7.5 MiB

Length

Max length: 373493
Median length: 322
Mean length: 47.98270492
Min length: 1

Characters and Unicode

Total characters: 43937475
Distinct characters: 309
Distinct categories: 21
Distinct scripts: 5
Distinct blocks: 16
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
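
The category breakdown for this variable reports 116404 characters in the Control category, which are invisible in the rendered tables. A minimal sketch, assuming the aim is to collapse such characters to single spaces before further text processing, is given below. `clean_locality` is a hypothetical helper, and the example input is invented by adding a tab and a newline to one of the sample values.

```python
import unicodedata


def clean_locality(text):
    """Hypothetical helper: replace control characters and collapse whitespace."""
    # "Cc" is the Unicode general category for control characters.
    replaced = "".join(" " if unicodedata.category(ch) == "Cc" else ch for ch in text)
    # split() with no arguments drops runs of whitespace and trims both ends.
    return " ".join(replaced.split())


# Invented input: a sample value with a tab and a trailing newline added.
raw = "Dept. Piura:\tAyabaca\n"
cleaned = clean_locality(raw)
print(repr(cleaned))
```

Which control characters actually occur in the data is not recoverable from the report, so the replacement rule here is only one reasonable choice.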

Unique

Unique: 529251
Unique (%): 57.8%

Sample

1st row: Gulf of Mexico
2nd row: Dept. Piura: Ayabaca
3rd row: Dep. Pilcomayo. al E a 2 Km de P. Porteño.
4th row: Selva siempre verde en las quebradas al norte de Los Tanques, arriba de la Planta Eléctrica, en las cabeceras del Río San Gián, al sur de Borburata.
5th row: Flat terrain near Skukuza rest camp, Kruger National Park.

Value                    Count     Frequency (%)
of                       347842    5.0%
de                       133785    1.9%
the                      82580     1.2%
km                       81320     1.2%
near                     74890     1.1%
on                       60171     0.9%
and                      59914     0.9%
in                       57394     0.8%
county                   55824     0.8%
la                       50630     0.7%
Other values (260445)    5924543   85.5%

Most occurring characters

ValueCountFrequency (%)
5977545
 
13.6%
a 3997394
 
9.1%
e 3081991
 
7.0%
o 2904902
 
6.6%
n 2422306
 
5.5%
i 2249317
 
5.1%
r 2223035
 
5.1%
t 1935539
 
4.4%
l 1565888
 
3.6%
s 1518943
 
3.5%
Other values (299) 16060615
36.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 30180500
68.7%
Space Separator 5977545
 
13.6%
Uppercase Letter 4570901
 
10.4%
Other Punctuation 2198902
 
5.0%
Decimal Number 593299
 
1.4%
Dash Punctuation 123186
 
0.3%
Control 116404
 
0.3%
Open Punctuation 80204
 
0.2%
Close Punctuation 79527
 
0.2%
Connector Punctuation 6065
 
< 0.1%
Other values (11) 10942
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3997394
13.2%
e 3081991
10.2%
o 2904902
9.6%
n 2422306
 
8.0%
i 2249317
 
7.5%
r 2223035
 
7.4%
t 1935539
 
6.4%
l 1565888
 
5.2%
s 1518943
 
5.0%
u 1126719
 
3.7%
Other values (121) 7154466
23.7%
Uppercase Letter
ValueCountFrequency (%)
C 535962
 
11.7%
S 451705
 
9.9%
M 342129
 
7.5%
P 333055
 
7.3%
R 291032
 
6.4%
B 257123
 
5.6%
A 233082
 
5.1%
N 230710
 
5.0%
L 203826
 
4.5%
T 191753
 
4.2%
Other values (64) 1500524
32.8%
Other Punctuation
ValueCountFrequency (%)
. 1227571
55.8%
, 747186
34.0%
: 104272
 
4.7%
; 41055
 
1.9%
' 32466
 
1.5%
" 21999
 
1.0%
/ 11348
 
0.5%
& 8984
 
0.4%
# 1963
 
0.1%
? 1416
 
0.1%
Other values (10) 642
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 114514
19.3%
2 87351
14.7%
0 72576
12.2%
5 66155
11.2%
3 61822
10.4%
4 50921
8.6%
6 40384
 
6.8%
8 34709
 
5.9%
7 34671
 
5.8%
9 30196
 
5.1%
Math Symbol
ValueCountFrequency (%)
= 1342
34.3%
± 1154
29.5%
+ 684
17.5%
> 257
 
6.6%
< 247
 
6.3%
~ 219
 
5.6%
| 7
 
0.2%
3
 
0.1%
3
 
0.1%
× 2
 
0.1%
Control
ValueCountFrequency (%)
115843
99.5%
522
 
0.4%
 13
 
< 0.1%
 11
 
< 0.1%
 7
 
< 0.1%
 4
 
< 0.1%
 3
 
< 0.1%
 1
 
< 0.1%
Other Number
ValueCountFrequency (%)
½ 2797
66.7%
¼ 1213
28.9%
¾ 156
 
3.7%
² 14
 
0.3%
11
 
0.3%
3
 
0.1%
2
 
< 0.1%
Format
ValueCountFrequency (%)
 2
20.0%
­ 2
20.0%
2
20.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
Open Punctuation
ValueCountFrequency (%)
( 59332
74.0%
[ 20706
 
25.8%
100
 
0.1%
44
 
0.1%
{ 22
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
° 1435
99.7%
2
 
0.1%
1
 
0.1%
1
 
0.1%
¦ 1
 
0.1%
Nonspacing Mark
ValueCountFrequency (%)
ͤ 2
28.6%
̈ 2
28.6%
̋ 1
14.3%
́ 1
14.3%
1
14.3%
Modifier Symbol
ValueCountFrequency (%)
´ 101
95.3%
¨ 2
 
1.9%
^ 2
 
1.9%
˶ 1
 
0.9%
Currency Symbol
ValueCountFrequency (%)
¢ 25
59.5%
¤ 12
28.6%
$ 4
 
9.5%
£ 1
 
2.4%
Dash Punctuation
ValueCountFrequency (%)
- 123170
> 99.9%
10
 
< 0.1%
6
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 58762
73.9%
] 20740
 
26.1%
} 25
 
< 0.1%
Initial Punctuation
ValueCountFrequency (%)
« 175
82.9%
35
 
16.6%
1
 
0.5%
Final Punctuation
ValueCountFrequency (%)
» 172
92.0%
9
 
4.8%
6
 
3.2%
Modifier Letter
ValueCountFrequency (%)
ʻ 91
97.8%
1
 
1.1%
1
 
1.1%
Other Letter
ValueCountFrequency (%)
º 708
96.7%
ª 24
 
3.3%
Space Separator
ValueCountFrequency (%)
5977545
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 6065
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 34752091
79.1%
Common 9185332
 
20.9%
Greek 43
 
< 0.1%
Inherited 8
 
< 0.1%
Cyrillic 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3997394
 
11.5%
e 3081991
 
8.9%
o 2904902
 
8.4%
n 2422306
 
7.0%
i 2249317
 
6.5%
r 2223035
 
6.4%
t 1935539
 
5.6%
l 1565888
 
4.5%
s 1518943
 
4.4%
u 1126719
 
3.2%
Other values (191) 11726057
33.7%
Common
ValueCountFrequency (%)
5977545
65.1%
. 1227571
 
13.4%
, 747186
 
8.1%
- 123170
 
1.3%
115843
 
1.3%
1 114514
 
1.2%
: 104272
 
1.1%
2 87351
 
1.0%
0 72576
 
0.8%
5 66155
 
0.7%
Other values (84) 549149
 
6.0%
Greek
ValueCountFrequency (%)
λ 12
27.9%
Κ 6
14.0%
ν 6
14.0%
η 6
14.0%
υ 6
14.0%
ή 6
14.0%
Δ 1
 
2.3%
Inherited
ValueCountFrequency (%)
ͤ 2
25.0%
̈ 2
25.0%
1
12.5%
̋ 1
12.5%
́ 1
12.5%
1
12.5%
Cyrillic
ValueCountFrequency (%)
ҫ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 43737376
99.5%
None 199715
 
0.5%
Punctuation 244
 
< 0.1%
Modifier Letters 92
 
< 0.1%
Number Forms 16
 
< 0.1%
Latin Ext Additional 10
 
< 0.1%
Diacriticals 6
 
< 0.1%
Math Operators 3
 
< 0.1%
Arrows 3
 
< 0.1%
Box Drawing 3
 
< 0.1%
Other values (6) 7
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5977545
 
13.7%
a 3997394
 
9.1%
e 3081991
 
7.0%
o 2904902
 
6.6%
n 2422306
 
5.5%
i 2249317
 
5.1%
r 2223035
 
5.1%
t 1935539
 
4.4%
l 1565888
 
3.6%
s 1518943
 
3.5%
Other values (87) 15860516
36.3%
None
ValueCountFrequency (%)
í 46285
23.2%
á 36812
18.4%
é 24344
12.2%
ó 20532
10.3%
ñ 10164
 
5.1%
ã 7957
 
4.0%
ú 5901
 
3.0%
ç 5120
 
2.6%
ü 4177
 
2.1%
ä 3757
 
1.9%
Other values (161) 34666
17.4%
Punctuation
ValueCountFrequency (%)
100
41.0%
44
18.0%
35
 
14.3%
26
 
10.7%
10
 
4.1%
9
 
3.7%
6
 
2.5%
6
 
2.5%
2
 
0.8%
1
 
0.4%
Other values (5) 5
 
2.0%
Modifier Letters
ValueCountFrequency (%)
ʻ 91
98.9%
˶ 1
 
1.1%
Number Forms
ValueCountFrequency (%)
11
68.8%
3
 
18.8%
2
 
12.5%
Math Operators
ValueCountFrequency (%)
3
100.0%
Latin Ext Additional
ValueCountFrequency (%)
3
30.0%
2
20.0%
1
 
10.0%
1
 
10.0%
1
 
10.0%
1
 
10.0%
ế 1
 
10.0%
Arrows
ValueCountFrequency (%)
3
100.0%
Diacriticals
ValueCountFrequency (%)
ͤ 2
33.3%
̈ 2
33.3%
̋ 1
16.7%
́ 1
16.7%
Box Drawing
ValueCountFrequency (%)
2
66.7%
1
33.3%
IPA Ext
ValueCountFrequency (%)
ɶ 2
100.0%
Phonetic Ext Sup
ValueCountFrequency (%)
1
100.0%
Block Elements
ValueCountFrequency (%)
1
100.0%
Diacriticals Sup
ValueCountFrequency (%)
1
100.0%
Phonetic Ext
ValueCountFrequency (%)
1
100.0%
Cyrillic
ValueCountFrequency (%)
ҫ 1
100.0%

verbatimDepth
Text

Missing 

Distinct: 9
Distinct (%): 0.2%
Missing: 983702
Missing (%): 99.5%
Memory size: 7.5 MiB

Length

Max length: 29
Median length: 3
Mean length: 3.033617021
Min length: 2

Characters and Unicode

Total characters: 14258
Distinct characters: 25
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): 0.1%

Sample

1st row: ca.
2nd row: ca.
3rd row: ca.
4th row: ca.
5th row: ca.

Value               Count   Frequency (%)
ca                  4691    99.3%
intertidal          11      0.2%
mlw                 6       0.1%
above               4       0.1%
below               2       < 0.1%
infralittoral       1       < 0.1%
4-8                 1       < 0.1%
feet                1       < 0.1%
mean                1       < 0.1%
low                 1       < 0.1%
Other values (5)    5       0.1%

Most occurring characters

ValueCountFrequency (%)
a 4710
33.0%
c 4691
32.9%
. 4653
32.6%
t 27
 
0.2%
24
 
0.2%
l 23
 
0.2%
e 21
 
0.1%
r 14
 
0.1%
n 13
 
0.1%
i 13
 
0.1%
Other values (15) 69
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9565
67.1%
Other Punctuation 4653
32.6%
Space Separator 24
 
0.2%
Uppercase Letter 11
 
0.1%
Decimal Number 3
 
< 0.1%
Dash Punctuation 1
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4710
49.2%
c 4691
49.0%
t 27
 
0.3%
l 23
 
0.2%
e 21
 
0.2%
r 14
 
0.1%
n 13
 
0.1%
i 13
 
0.1%
d 11
 
0.1%
w 10
 
0.1%
Other values (7) 32
 
0.3%
Decimal Number
ValueCountFrequency (%)
4 1
33.3%
8 1
33.3%
1 1
33.3%
Other Punctuation
ValueCountFrequency (%)
. 4653
100.0%
Space Separator
ValueCountFrequency (%)
24
100.0%
Uppercase Letter
ValueCountFrequency (%)
I 11
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Math Symbol
ValueCountFrequency (%)
< 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9576
67.2%
Common 4682
32.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4710
49.2%
c 4691
49.0%
t 27
 
0.3%
l 23
 
0.2%
e 21
 
0.2%
r 14
 
0.1%
n 13
 
0.1%
i 13
 
0.1%
I 11
 
0.1%
d 11
 
0.1%
Other values (8) 42
 
0.4%
Common
ValueCountFrequency (%)
. 4653
99.4%
24
 
0.5%
4 1
 
< 0.1%
- 1
 
< 0.1%
8 1
 
< 0.1%
< 1
 
< 0.1%
1 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14258
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4710
33.0%
c 4691
32.9%
. 4653
32.6%
t 27
 
0.2%
24
 
0.2%
l 23
 
0.2%
e 21
 
0.1%
r 14
 
0.1%
n 13
 
0.1%
i 13
 
0.1%
Other values (15) 69
 
0.5%

decimalLatitude
Text

Missing 

Distinct: 30964
Distinct (%): 21.0%
Missing: 841005
Missing (%): 85.1%
Memory size: 7.5 MiB

Length

Max length: 10
Median length: 9
Mean length: 5.800233383
Min length: 3

Characters and Unicode

Total characters: 854937
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
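
This variable is profiled as Text even though its values are decimal numbers. A minimal sketch of converting such strings to floats, with a range check, is shown below. `parse_coordinate` is a hypothetical helper, and the bounds of 90 degrees for latitude and 180 degrees for longitude are the usual geographic limits rather than something stated in the report.

```python
def parse_coordinate(text, limit):
    """Hypothetical helper: parse a decimal-degree string.

    Returns a float, or None when the text is empty, not numeric,
    or outside the range [-limit, limit].
    """
    if text is None or not text.strip():
        return None
    try:
        value = float(text)
    except ValueError:
        return None
    # A NaN fails this comparison and is therefore rejected as well.
    return value if -limit <= value <= limit else None


# The five sample rows listed for decimalLatitude.
sample = ["26.2786", "-35.57", "18.6519", "-36.68", "5.86667"]
latitudes = [parse_coordinate(s, 90.0) for s in sample]
print(latitudes)
```

The same helper can be applied to decimalLongitude by passing 180.0 as the limit.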

Unique

Unique: 16592
Unique (%): 11.3%

Sample

1st row: 26.2786
2nd row: -35.57
3rd row: 18.6519
4th row: -36.68
5th row: 5.86667

Value                   Count    Frequency (%)
38.9694                 858      0.6%
38.895                  856      0.6%
9.405                   393      0.3%
0.83                    372      0.3%
0.35                    371      0.3%
3.61                    370      0.3%
5.16667                 340      0.2%
5.2                     335      0.2%
38.8664                 324      0.2%
12.83                   312      0.2%
Other values (28342)    142866   96.9%

Most occurring characters

ValueCountFrequency (%)
. 147397
17.2%
3 100620
11.8%
1 79516
9.3%
8 73185
8.6%
2 72959
8.5%
5 68749
8.0%
6 65516
7.7%
7 59644
7.0%
4 52733
 
6.2%
9 50290
 
5.9%
Other values (2) 84328
9.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 665840
77.9%
Other Punctuation 147397
 
17.2%
Dash Punctuation 41700
 
4.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 100620
15.1%
1 79516
11.9%
8 73185
11.0%
2 72959
11.0%
5 68749
10.3%
6 65516
9.8%
7 59644
9.0%
4 52733
7.9%
9 50290
7.6%
0 42628
6.4%
Other Punctuation
ValueCountFrequency (%)
. 147397
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 41700
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 854937
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 147397
17.2%
3 100620
11.8%
1 79516
9.3%
8 73185
8.6%
2 72959
8.5%
5 68749
8.0%
6 65516
7.7%
7 59644
7.0%
4 52733
 
6.2%
9 50290
 
5.9%
Other values (2) 84328
9.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 854937
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 147397
17.2%
3 100620
11.8%
1 79516
9.3%
8 73185
8.6%
2 72959
8.5%
5 68749
8.0%
6 65516
7.7%
7 59644
7.0%
4 52733
 
6.2%
9 50290
 
5.9%
Other values (2) 84328
9.9%

decimalLongitude
Text

Missing 

Distinct32805
Distinct (%)22.3%
Missing841005
Missing (%)85.1%
Memory size7.5 MiB
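
The percentages above can be checked by hand. A minimal sketch, assuming that Missing (%) is taken over all 988402 observations reported in the overview and that Distinct (%) is taken over the non-missing values only (an inference from the figures, not something the report states):

```python
N_OBSERVATIONS = 988_402   # "Number of observations" from the report overview
missing = 841_005          # Missing count for decimalLongitude
distinct = 32_805          # Distinct count for decimalLongitude

non_missing = N_OBSERVATIONS - missing
missing_pct = 100 * missing / N_OBSERVATIONS
# Assumed denominator: non-missing values only.
distinct_pct = 100 * distinct / non_missing

print(non_missing)             # 147397
print(round(missing_pct, 1))   # 85.1
print(round(distinct_pct, 1))  # 22.3
```

Under these assumptions the computed values agree with the reported 85.1% and 22.3%.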

Length

Max length11
Median length10
Mean length6.791515431
Min length3

Characters and Unicode

Total characters1001049
Distinct characters12
Distinct categories3
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique17344
Unique (%)11.8%

Sample

1st row-83.7803
2nd row137.32
3rd row-71.5572
4th row-72.97
5th row-60.5667
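
Because the report types decimalLongitude as Text, the values would need converting before any numeric analysis. A minimal sketch, assuming the strings are plain decimal degrees as in the sample rows; `parse_longitude` is a hypothetical helper, and the -180 to 180 bound is the conventional range for longitude rather than something this report checks:

```python
def parse_longitude(text):
    """Return the value as a float if it is a valid decimal longitude, else None."""
    try:
        value = float(text)
    except (TypeError, ValueError):
        # Missing cells (None) and non-numeric strings end up here.
        return None
    # NaN fails both comparisons, so it is rejected as well.
    return value if -180.0 <= value <= 180.0 else None

# The five sample rows shown for this variable.
sample = ["-83.7803", "137.32", "-71.5572", "-72.97", "-60.5667"]
parsed = [parse_longitude(s) for s in sample]
print(parsed)
```

Returning None for unparseable or out-of-range input keeps bad cells distinguishable from legitimate zero values.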
Value                 Count   Frequency (%)
77.1767               842     0.6%
77.0367               831     0.6%
59.4833               487     0.3%
53.2                  382     0.3%
79.8635               382     0.3%
52.33                 360     0.2%
59.48                 325     0.2%
79.73                 307     0.2%
88.08                 302     0.2%
70.95                 301     0.2%
Other values (31262)  142878  96.9%

Most occurring characters

ValueCountFrequency (%)
. 147397
14.7%
- 125223
12.5%
7 109174
10.9%
1 87815
8.8%
6 84392
8.4%
5 83702
8.4%
3 71057
7.1%
8 70903
7.1%
9 59906
6.0%
2 57756
 
5.8%
Other values (2) 103724
10.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 728429
72.8%
Other Punctuation 147397
 
14.7%
Dash Punctuation 125223
 
12.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 109174
15.0%
1 87815
12.1%
6 84392
11.6%
5 83702
11.5%
3 71057
9.8%
8 70903
9.7%
9 59906
8.2%
2 57756
7.9%
4 51961
7.1%
0 51763
7.1%
Other Punctuation
ValueCountFrequency (%)
. 147397
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 125223
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1001049
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 147397
14.7%
- 125223
12.5%
7 109174
10.9%
1 87815
8.8%
6 84392
8.4%
5 83702
8.4%
3 71057
7.1%
8 70903
7.1%
9 59906
6.0%
2 57756
 
5.8%
Other values (2) 103724
10.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1001049
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 147397
14.7%
- 125223
12.5%
7 109174
10.9%
1 87815
8.8%
6 84392
8.4%
5 83702
8.4%
3 71057
7.1%
8 70903
7.1%
9 59906
6.0%
2 57756
 
5.8%
Other values (2) 103724
10.4%
Distinct20
Distinct (%)1.4%
Missing987002
Missing (%)99.9%
Memory size7.5 MiB

Length

Max length7
Median length6
Mean length5.855
Min length4

Characters and Unicode

Total characters8197
Distinct characters10
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)0.1%

Sample

1st row16000.0
2nd row1500.0
3rd row250.0
4th row500.0
5th row1500.0
Value              Count  Frequency (%)
16000.0            286    20.4%
1000.0             277    19.8%
500.0              234    16.7%
250.0              145    10.4%
3000.0             135    9.6%
5000.0             68     4.9%
750.0              67     4.8%
1500.0             51     3.6%
2000.0             38     2.7%
3500.0             32     2.3%
Other values (10)  67     4.8%

Most occurring characters

ValueCountFrequency (%)
0 4801
58.6%
. 1400
 
17.1%
1 647
 
7.9%
5 613
 
7.5%
6 291
 
3.6%
2 196
 
2.4%
3 171
 
2.1%
7 67
 
0.8%
8 9
 
0.1%
4 2
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6797
82.9%
Other Punctuation 1400
 
17.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 4801
70.6%
1 647
 
9.5%
5 613
 
9.0%
6 291
 
4.3%
2 196
 
2.9%
3 171
 
2.5%
7 67
 
1.0%
8 9
 
0.1%
4 2
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
. 1400
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8197
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 4801
58.6%
. 1400
 
17.1%
1 647
 
7.9%
5 613
 
7.5%
6 291
 
3.6%
2 196
 
2.4%
3 171
 
2.1%
7 67
 
0.8%
8 9
 
0.1%
4 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8197
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 4801
58.6%
. 1400
 
17.1%
1 647
 
7.9%
5 613
 
7.5%
6 291
 
3.6%
2 196
 
2.4%
3 171
 
2.1%
7 67
 
0.8%
8 9
 
0.1%
4 2
 
< 0.1%
Distinct4
Distinct (%)0.1%
Missing980404
Missing (%)99.2%
Memory size7.5 MiB

Length

Max length23
Median length23
Mean length22.98012003
Min length4

Characters and Unicode

Total characters183795
Distinct characters23
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)< 0.1%

Sample

1st rowDegrees Minutes Seconds
2nd rowDegrees Minutes Seconds
3rd rowDegrees Minutes Seconds
4th rowDegrees Minutes Seconds
5th rowDegrees Minutes Seconds
Value    Count  Frequency (%)
degrees  7992   33.3%
minutes  7986   33.3%
seconds  7986   33.3%
decimal  6      < 0.1%
quad     5      < 0.1%
unknown  1      < 0.1%
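
The table above appears to count lowercased words rather than whole cell values (the sample rows read "Degrees Minutes Seconds", while the table lists `degrees`, `minutes`, and `seconds` separately). A minimal sketch of that kind of tokenisation, assuming whitespace splitting, lowercasing, and stripping of leading and trailing punctuation; the exact rules the profiler applies are not stated in this report, and `word_counts` and the input list are hypothetical:

```python
import string
from collections import Counter

def word_counts(values):
    """Count lowercased, punctuation-trimmed words across all strings."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            # Trim punctuation at the ends only; inner characters are kept.
            counts[token.strip(string.punctuation)] += 1
    return counts

# Hypothetical cell values, not taken from the dataset.
cells = ["Degrees Minutes Seconds", "Degrees Minutes Seconds", "QUAD"]
counts = word_counts(cells)
for word, n in counts.most_common():
    print(word, n)
```

With this scheme a token made up only of punctuation collapses to an empty string, which would show up as a blank entry in a word table.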

Most occurring characters

ValueCountFrequency (%)
e 39954
21.7%
s 23964
13.0%
15978
 
8.7%
n 15975
 
8.7%
D 7997
 
4.4%
g 7992
 
4.3%
r 7992
 
4.3%
d 7992
 
4.3%
i 7992
 
4.3%
c 7992
 
4.3%
Other values (13) 39967
21.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 143832
78.3%
Uppercase Letter 23985
 
13.0%
Space Separator 15978
 
8.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 39954
27.8%
s 23964
16.7%
n 15975
 
11.1%
g 7992
 
5.6%
r 7992
 
5.6%
d 7992
 
5.6%
i 7992
 
5.6%
c 7992
 
5.6%
o 7987
 
5.6%
t 7986
 
5.6%
Other values (6) 8006
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
D 7997
33.3%
S 7986
33.3%
M 7986
33.3%
U 6
 
< 0.1%
Q 5
 
< 0.1%
A 5
 
< 0.1%
Space Separator
ValueCountFrequency (%)
15978
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 167817
91.3%
Common 15978
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 39954
23.8%
s 23964
14.3%
n 15975
 
9.5%
D 7997
 
4.8%
g 7992
 
4.8%
r 7992
 
4.8%
d 7992
 
4.8%
i 7992
 
4.8%
c 7992
 
4.8%
o 7987
 
4.8%
Other values (12) 31980
19.1%
Common
ValueCountFrequency (%)
15978
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 183795
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 39954
21.7%
s 23964
13.0%
15978
 
8.7%
n 15975
 
8.7%
D 7997
 
4.4%
g 7992
 
4.3%
r 7992
 
4.3%
d 7992
 
4.3%
i 7992
 
4.3%
c 7992
 
4.3%
Other values (13) 39967
21.7%

verbatimSRS
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing988401
Missing (%)> 99.9%
Memory size7.5 MiB

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters10
Distinct characters5
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st row1938-11-11
ValueCountFrequency (%)
1938-11-11 1
100.0%

Most occurring characters

ValueCountFrequency (%)
1 5
50.0%
- 2
 
20.0%
9 1
 
10.0%
3 1
 
10.0%
8 1
 
10.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8
80.0%
Dash Punctuation 2
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 5
62.5%
9 1
 
12.5%
3 1
 
12.5%
8 1
 
12.5%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 10
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 5
50.0%
- 2
 
20.0%
9 1
 
10.0%
3 1
 
10.0%
8 1
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 5
50.0%
- 2
 
20.0%
9 1
 
10.0%
3 1
 
10.0%
8 1
 
10.0%

footprintSRS
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing988401
Missing (%)> 99.9%
Memory size7.5 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters3
Distinct characters3
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st row315
ValueCountFrequency (%)
315 1
100.0%

Most occurring characters

ValueCountFrequency (%)
3 1
33.3%
1 1
33.3%
5 1
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 1
33.3%
1 1
33.3%
5 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 3
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 1
33.3%
1 1
33.3%
5 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 1
33.3%
1 1
33.3%
5 1
33.3%

footprintSpatialFit
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing988401
Missing (%)> 99.9%
Memory size7.5 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters3
Distinct characters3
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st row315
ValueCountFrequency (%)
315 1
100.0%

Most occurring characters

ValueCountFrequency (%)
3 1
33.3%
1 1
33.3%
5 1
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 1
33.3%
1 1
33.3%
5 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 3
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 1
33.3%
1 1
33.3%
5 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 1
33.3%
1 1
33.3%
5 1
33.3%

georeferencedBy
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing988401
Missing (%)> 99.9%
Memory size7.5 MiB

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters4
Distinct characters4
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st row1938
ValueCountFrequency (%)
1938 1
100.0%

Most occurring characters

ValueCountFrequency (%)
1 1
25.0%
9 1
25.0%
3 1
25.0%
8 1
25.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1
25.0%
9 1
25.0%
3 1
25.0%
8 1
25.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1
25.0%
9 1
25.0%
3 1
25.0%
8 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1
25.0%
9 1
25.0%
3 1
25.0%
8 1
25.0%

georeferencedDate
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing988401
Missing (%)> 99.9%
Memory size7.5 MiB

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters2
Distinct characters1
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st row11
ValueCountFrequency (%)
11 1
100.0%

Most occurring characters

ValueCountFrequency (%)
1 2
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2
100.0%

georeferenceProtocol
Text

Missing 

Distinct20
Distinct (%)0.1%
Missing960543
Missing (%)97.2%
Memory size7.5 MiB

Length

Max length17
Median length16
Mean length8.355289135
Min length2

Characters and Unicode

Total characters232770
Distinct characters40
Distinct categories8
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4
Unique (%)< 0.1%

Sample

1st rowGazetteer
2nd rowGazetteer
3rd rowGazetteer
4th rowGazetteer
5th rowLabel
Value              Count  Frequency (%)
gazetteer          10962  30.3%
gps                5054   14.0%
gis                4557   12.6%
arcview            4557   12.6%
label              3720   10.3%
google             3348   9.3%
maps               2711   7.5%
earth              637    1.8%
source             400    1.1%
g-1                76     0.2%
Other values (11)  162    0.4%

Most occurring characters

ValueCountFrequency (%)
e 44988
19.3%
G 23987
 
10.3%
t 22571
 
9.7%
a 18101
 
7.8%
r 16566
 
7.1%
z 10962
 
4.7%
S 9997
 
4.3%
8325
 
3.6%
o 7139
 
3.1%
l 7094
 
3.0%
Other values (30) 63040
27.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 160449
68.9%
Uppercase Letter 54726
 
23.5%
Space Separator 8325
 
3.6%
Close Punctuation 4557
 
2.0%
Open Punctuation 4557
 
2.0%
Decimal Number 78
 
< 0.1%
Dash Punctuation 76
 
< 0.1%
Other Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 44988
28.0%
t 22571
14.1%
a 18101
11.3%
r 16566
 
10.3%
z 10962
 
6.8%
o 7139
 
4.4%
l 7094
 
4.4%
c 4961
 
3.1%
i 4578
 
2.9%
w 4559
 
2.8%
Other values (12) 18930
11.8%
Uppercase Letter
ValueCountFrequency (%)
G 23987
43.8%
S 9997
18.3%
P 5040
 
9.2%
I 4557
 
8.3%
A 4557
 
8.3%
L 3715
 
6.8%
M 2172
 
4.0%
E 637
 
1.2%
W 53
 
0.1%
C 9
 
< 0.1%
Other values (2) 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
8325
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4557
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4557
100.0%
Decimal Number
ValueCountFrequency (%)
1 78
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 76
100.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 215175
92.4%
Common 17595
 
7.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 44988
20.9%
G 23987
11.1%
t 22571
10.5%
a 18101
 
8.4%
r 16566
 
7.7%
z 10962
 
5.1%
S 9997
 
4.6%
o 7139
 
3.3%
l 7094
 
3.3%
P 5040
 
2.3%
Other values (24) 48730
22.6%
Common
ValueCountFrequency (%)
8325
47.3%
) 4557
25.9%
( 4557
25.9%
1 78
 
0.4%
- 76
 
0.4%
. 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 232770
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 44988
19.3%
G 23987
 
10.3%
t 22571
 
9.7%
a 18101
 
7.8%
r 16566
 
7.1%
z 10962
 
4.7%
S 9997
 
4.3%
8325
 
3.6%
o 7139
 
3.1%
l 7094
 
3.0%
Other values (30) 63040
27.1%

georeferenceSources
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing988401
Missing (%)> 99.9%
Memory size7.5 MiB

Length

Max length11
Median length11
Mean length11
Min length11

Characters and Unicode

Total characters11
Distinct characters8
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st row11 Nov 1938
ValueCountFrequency (%)
11 1
33.3%
nov 1
33.3%
1938 1
33.3%

Most occurring characters

ValueCountFrequency (%)
1 3
27.3%
2
18.2%
N 1
 
9.1%
o 1
 
9.1%
v 1
 
9.1%
9 1
 
9.1%
3 1
 
9.1%
8 1
 
9.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
54.5%
Space Separator 2
 
18.2%
Lowercase Letter 2
 
18.2%
Uppercase Letter 1
 
9.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 3
50.0%
9 1
 
16.7%
3 1
 
16.7%
8 1
 
16.7%
Lowercase Letter
ValueCountFrequency (%)
o 1
50.0%
v 1
50.0%
Space Separator
ValueCountFrequency (%)
2
100.0%
Uppercase Letter
ValueCountFrequency (%)
N 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8
72.7%
Latin 3
 
27.3%

Most frequent character per script

Common
ValueCountFrequency (%)
1 3
37.5%
2
25.0%
9 1
 
12.5%
3 1
 
12.5%
8 1
 
12.5%
Latin
ValueCountFrequency (%)
N 1
33.3%
o 1
33.3%
v 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 3
27.3%
2
18.2%
N 1
 
9.1%
o 1
 
9.1%
v 1
 
9.1%
9 1
 
9.1%
3 1
 
9.1%
8 1
 
9.1%

georeferenceRemarks
Text

Missing 

Distinct38
Distinct (%)33.6%
Missing988289
Missing (%)> 99.9%
Memory size7.5 MiB

Length

Max length53
Median length40
Mean length18.7699115
Min length3

Characters and Unicode

Total characters2121
Distinct characters54
Distinct categories7
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique21
Unique (%)18.6%

Sample

1st row+-1000m
2nd rowstop 1 - beginning of bike path, along GW pkwy
3rd rowca.; ca.
4th rowstop 1-ditch; stop 2- polkweed; stop 3; stop 4
5th rowLong. 4 8 W - 4 15 W
Value              Count  Frequency (%)
stop               48     10.5%
4                  29     6.3%
(blank)            26     5.7%
ca                 23     5.0%
w                  22     4.8%
1                  21     4.6%
invalid            13     2.8%
of                 13     2.8%
as                 13     2.8%
seconds            13     2.8%
Other values (63)  238    51.9%

Most occurring characters

ValueCountFrequency (%)
346
16.3%
o 142
 
6.7%
n 124
 
5.8%
a 118
 
5.6%
t 118
 
5.6%
e 116
 
5.5%
i 113
 
5.3%
s 94
 
4.4%
p 82
 
3.9%
l 81
 
3.8%
Other values (44) 787
37.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1365
64.4%
Space Separator 346
 
16.3%
Decimal Number 140
 
6.6%
Uppercase Letter 140
 
6.6%
Other Punctuation 82
 
3.9%
Dash Punctuation 42
 
2.0%
Math Symbol 6
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 142
10.4%
n 124
9.1%
a 118
 
8.6%
t 118
 
8.6%
e 116
 
8.5%
i 113
 
8.3%
s 94
 
6.9%
p 82
 
6.0%
l 81
 
5.9%
d 75
 
5.5%
Other values (13) 302
22.1%
Uppercase Letter
ValueCountFrequency (%)
W 35
25.0%
S 25
17.9%
L 15
10.7%
G 14
 
10.0%
T 8
 
5.7%
M 7
 
5.0%
U 7
 
5.0%
C 6
 
4.3%
V 5
 
3.6%
F 4
 
2.9%
Other values (5) 14
 
10.0%
Decimal Number
ValueCountFrequency (%)
1 44
31.4%
4 33
23.6%
0 18
12.9%
8 15
 
10.7%
5 12
 
8.6%
3 9
 
6.4%
2 8
 
5.7%
6 1
 
0.7%
Other Punctuation
ValueCountFrequency (%)
. 48
58.5%
; 21
25.6%
, 9
 
11.0%
/ 3
 
3.7%
' 1
 
1.2%
Space Separator
ValueCountFrequency (%)
346
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 42
100.0%
Math Symbol
ValueCountFrequency (%)
+ 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1505
71.0%
Common 616
29.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 142
 
9.4%
n 124
 
8.2%
a 118
 
7.8%
t 118
 
7.8%
e 116
 
7.7%
i 113
 
7.5%
s 94
 
6.2%
p 82
 
5.4%
l 81
 
5.4%
d 75
 
5.0%
Other values (28) 442
29.4%
Common
ValueCountFrequency (%)
346
56.2%
. 48
 
7.8%
1 44
 
7.1%
- 42
 
6.8%
4 33
 
5.4%
; 21
 
3.4%
0 18
 
2.9%
8 15
 
2.4%
5 12
 
1.9%
, 9
 
1.5%
Other values (6) 28
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2121
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
346
16.3%
o 142
 
6.7%
n 124
 
5.8%
a 118
 
5.6%
t 118
 
5.6%
e 116
 
5.5%
i 113
 
5.3%
s 94
 
4.4%
p 82
 
3.9%
l 81
 
3.8%
Other values (44) 787
37.1%

latestEpochOrHighestSeries
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing988401
Missing (%)> 99.9%
Memory size7.5 MiB

Length

Max length42
Median length42
Mean length42
Min length42

Characters and Unicode

Total characters42
Distinct characters22
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st rowSouth America - Neotropics, Colombia, Meta
ValueCountFrequency (%)
south 1
16.7%
america 1
16.7%
1
16.7%
neotropics 1
16.7%
colombia 1
16.7%
meta 1
16.7%

Most occurring characters

ValueCountFrequency (%)
5
11.9%
o 5
11.9%
t 3
 
7.1%
e 3
 
7.1%
i 3
 
7.1%
a 3
 
7.1%
c 2
 
4.8%
m 2
 
4.8%
r 2
 
4.8%
, 2
 
4.8%
Other values (12) 12
28.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 29
69.0%
Space Separator 5
 
11.9%
Uppercase Letter 5
 
11.9%
Other Punctuation 2
 
4.8%
Dash Punctuation 1
 
2.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 5
17.2%
t 3
10.3%
e 3
10.3%
i 3
10.3%
a 3
10.3%
c 2
 
6.9%
m 2
 
6.9%
r 2
 
6.9%
p 1
 
3.4%
b 1
 
3.4%
Other values (4) 4
13.8%
Uppercase Letter
ValueCountFrequency (%)
C 1
20.0%
S 1
20.0%
N 1
20.0%
A 1
20.0%
M 1
20.0%
Space Separator
ValueCountFrequency (%)
5
100.0%
Other Punctuation
ValueCountFrequency (%)
, 2
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 34
81.0%
Common 8
 
19.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 5
14.7%
t 3
 
8.8%
e 3
 
8.8%
i 3
 
8.8%
a 3
 
8.8%
c 2
 
5.9%
m 2
 
5.9%
r 2
 
5.9%
p 1
 
2.9%
b 1
 
2.9%
Other values (9) 9
26.5%
Common
ValueCountFrequency (%)
5
62.5%
, 2
 
25.0%
- 1
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5
11.9%
o 5
11.9%
t 3
 
7.1%
e 3
 
7.1%
i 3
 
7.1%
a 3
 
7.1%
c 2
 
4.8%
m 2
 
4.8%
r 2
 
4.8%
, 2
 
4.8%
Other values (12) 12
28.6%

earliestAgeOrLowestStage
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing988401
Missing (%)> 99.9%
Memory size7.5 MiB

Length

Max length13
Median length13
Mean length13
Min length13

Characters and Unicode

Total characters13
Distinct characters12
Distinct categories2
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st rowSOUTH_AMERICA
ValueCountFrequency (%)
south_america 1
100.0%

Most occurring characters

ValueCountFrequency (%)
A 2
15.4%
S 1
7.7%
O 1
7.7%
U 1
7.7%
T 1
7.7%
H 1
7.7%
_ 1
7.7%
M 1
7.7%
E 1
7.7%
R 1
7.7%
Other values (2) 2
15.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 12
92.3%
Connector Punctuation 1
 
7.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 2
16.7%
S 1
8.3%
O 1
8.3%
U 1
8.3%
T 1
8.3%
H 1
8.3%
M 1
8.3%
E 1
8.3%
R 1
8.3%
I 1
8.3%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12
92.3%
Common 1
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 2
16.7%
S 1
8.3%
O 1
8.3%
U 1
8.3%
T 1
8.3%
H 1
8.3%
M 1
8.3%
E 1
8.3%
R 1
8.3%
I 1
8.3%
Common
ValueCountFrequency (%)
_ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 2
15.4%
S 1
7.7%
O 1
7.7%
U 1
7.7%
T 1
7.7%
H 1
7.7%
_ 1
7.7%
M 1
7.7%
E 1
7.7%
R 1
7.7%
Other values (2) 2
15.4%

lowestBiostratigraphicZone
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing988401
Missing (%)> 99.9%
Memory size7.5 MiB

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters7
Distinct characters6
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st row7296210
ValueCountFrequency (%)
7296210 1
100.0%

Most occurring characters

ValueCountFrequency (%)
2 2
28.6%
7 1
14.3%
9 1
14.3%
6 1
14.3%
1 1
14.3%
0 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2
28.6%
7 1
14.3%
9 1
14.3%
6 1
14.3%
1 1
14.3%
0 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Common 7
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2
28.6%
7 1
14.3%
9 1
14.3%
6 1
14.3%
1 1
14.3%
0 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2
28.6%
7 1
14.3%
9 1
14.3%
6 1
14.3%
1 1
14.3%
0 1
14.3%

lithostratigraphicTerms
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing988401
Missing (%)> 99.9%
Memory size7.5 MiB

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters2
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st rowCO
ValueCountFrequency (%)
co 1
100.0%

Most occurring characters

ValueCountFrequency (%)
C 1
50.0%
O 1
50.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 1
50.0%
O 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 1
50.0%
O 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 1
50.0%
O 1
50.0%

group
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing988401
Missing (%)> 99.9%
Memory size7.5 MiB

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters4
Distinct characters4
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st rowMeta
ValueCountFrequency (%)
meta 1
100.0%

Most occurring characters

ValueCountFrequency (%)
M 1
25.0%
e 1
25.0%
t 1
25.0%
a 1
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3
75.0%
Uppercase Letter 1
 
25.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1
33.3%
t 1
33.3%
a 1
33.3%
Uppercase Letter
ValueCountFrequency (%)
M 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 1
25.0%
e 1
25.0%
t 1
25.0%
a 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 1
25.0%
e 1
25.0%
t 1
25.0%
a 1
25.0%

bed
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing988400
Missing (%)> 99.9%
Memory size7.5 MiB

Length

Max length32
Median length23.5
Mean length23.5
Min length15

Characters and Unicode

Total characters47
Distinct characters17
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)100.0%

Sample

1st rowRinorea pubiflora var. pubiflora
2nd rowVilla Vicencia.
ValueCountFrequency (%)
pubiflora 2
33.3%
rinorea 1
16.7%
var 1
16.7%
villa 1
16.7%
vicencia 1
16.7%

Most occurring characters

ValueCountFrequency (%)
a 6
12.8%
i 6
12.8%
r 4
 
8.5%
4
 
8.5%
l 4
 
8.5%
o 3
 
6.4%
p 2
 
4.3%
f 2
 
4.3%
V 2
 
4.3%
. 2
 
4.3%
Other values (7) 12
25.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 38
80.9%
Space Separator 4
 
8.5%
Uppercase Letter 3
 
6.4%
Other Punctuation 2
 
4.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6
15.8%
i 6
15.8%
r 4
10.5%
l 4
10.5%
o 3
7.9%
p 2
 
5.3%
f 2
 
5.3%
c 2
 
5.3%
b 2
 
5.3%
u 2
 
5.3%
Other values (3) 5
13.2%
Uppercase Letter
ValueCountFrequency (%)
V 2
66.7%
R 1
33.3%
Space Separator
ValueCountFrequency (%)
4
100.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 41
87.2%
Common 6
 
12.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6
14.6%
i 6
14.6%
r 4
9.8%
l 4
9.8%
o 3
 
7.3%
p 2
 
4.9%
f 2
 
4.9%
V 2
 
4.9%
c 2
 
4.9%
b 2
 
4.9%
Other values (5) 8
19.5%
Common
ValueCountFrequency (%)
4
66.7%
. 2
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 47
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 6
12.8%
i 6
12.8%
r 4
 
8.5%
4
 
8.5%
l 4
 
8.5%
o 3
 
6.4%
p 2
 
4.3%
f 2
 
4.3%
V 2
 
4.3%
. 2
 
4.3%
Other values (7) 12
25.5%
Distinct16
Distinct (%)0.7%
Missing985985
Missing (%)99.8%
Memory size7.5 MiB

Length

Max length12
Median length3
Mean length4.395531651
Min length2

Characters and Unicode

Total characters10624
Distinct characters17
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3 ?
Unique (%)0.1%

Sample

1st rowcf.
2nd rowcf.
3rd rowcf.
4th rowvel aff.
5th rowvel aff.
Value      Count  Frequency (%)
cf         1295   51.4%
aff        610    24.2%
uncertain  368    14.6%
s.l        125    5.0%
vel        77     3.1%
sp         15     0.6%
near       13     0.5%
nov        13     0.5%
s.s        5      0.2%

Most occurring characters

ValueCountFrequency (%)
f 2515
23.7%
. 2176
20.5%
c 1663
15.7%
a 991
 
9.3%
n 762
 
7.2%
e 458
 
4.3%
r 381
 
3.6%
i 368
 
3.5%
t 368
 
3.5%
u 365
 
3.4%
Other values (7) 577
 
5.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8341
78.5%
Other Punctuation 2176
 
20.5%
Space Separator 104
 
1.0%
Uppercase Letter 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
f 2515
30.2%
c 1663
19.9%
a 991
 
11.9%
n 762
 
9.1%
e 458
 
5.5%
r 381
 
4.6%
i 368
 
4.4%
t 368
 
4.4%
u 365
 
4.4%
l 202
 
2.4%
Other values (4) 268
 
3.2%
Other Punctuation
ValueCountFrequency (%)
. 2176
100.0%
Space Separator
ValueCountFrequency (%)
104
100.0%
Uppercase Letter
ValueCountFrequency (%)
U 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8344
78.5%
Common 2280
 
21.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
f 2515
30.1%
c 1663
19.9%
a 991
 
11.9%
n 762
 
9.1%
e 458
 
5.5%
r 381
 
4.6%
i 368
 
4.4%
t 368
 
4.4%
u 365
 
4.4%
l 202
 
2.4%
Other values (5) 271
 
3.2%
Common
ValueCountFrequency (%)
. 2176
95.4%
104
 
4.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10624
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
f 2515
23.7%
. 2176
20.5%
c 1663
15.7%
a 991
 
9.3%
n 762
 
7.2%
e 458
 
4.3%
r 381
 
3.6%
i 368
 
3.5%
t 368
 
3.5%
u 365
 
3.4%
Other values (7) 577
 
5.4%
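
The category breakdown above can be approximated with Python's standard `unicodedata` module. The following is a minimal sketch, not the profiler's own code; the helper name `category_counts` and the two sample values are illustrative assumptions. `unicodedata` exposes general categories (for example `Ll` for Lowercase Letter, `Po` for Other Punctuation, `Zs` for Space Separator) but not scripts or blocks, so only the category table is covered.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# Two of the sample values shown for this variable.
sample = ["cf.", "vel aff."]
counts = category_counts(sample)
for code, n in counts.most_common():
    print(code, n)
```

Running the same function over the full column would be expected to give totals comparable to the "Most occurring categories" table, although the profiler's exact counting rules are not confirmed here.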

typeStatus
Text

Missing 

Distinct 13
Distinct (%) 0.1%
Missing 967033
Missing (%) 97.8%
Memory size 7.5 MiB

Length

Max length 16
Median length 7
Mean length 7.474893537
Min length 4

Characters and Unicode

Total characters 159731
Distinct characters 15
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row ISOTYPE
2nd row ISOTYPE
3rd row HOLOTYPE
4th row ISOTYPE
5th row ISOTYPE

Value             Count  Frequency (%)
isotype           13211  61.8%
holotype          4263   19.9%
isosyntype        1377   6.4%
syntype           1202   5.6%
type              444    2.1%
isolectotype      434    2.0%
lectotype         195    0.9%
isoneotype        97     0.5%
paratype          75     0.4%
neotype           46     0.2%
Other values (3)  25     0.1%

Most occurring characters

ValueCountFrequency (%)
O 24437
15.3%
Y 23932
15.0%
E 22146
13.9%
T 21998
13.8%
P 21437
13.4%
S 17702
11.1%
I 15176
9.5%
L 4924
 
3.1%
H 4263
 
2.7%
N 2738
 
1.7%
Other values (5) 978
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 159731
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O 24437
15.3%
Y 23932
15.0%
E 22146
13.9%
T 21998
13.8%
P 21437
13.4%
S 17702
11.1%
I 15176
9.5%
L 4924
 
3.1%
H 4263
 
2.7%
N 2738
 
1.7%
Other values (5) 978
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 159731
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
O 24437
15.3%
Y 23932
15.0%
E 22146
13.9%
T 21998
13.8%
P 21437
13.4%
S 17702
11.1%
I 15176
9.5%
L 4924
 
3.1%
H 4263
 
2.7%
N 2738
 
1.7%
Other values (5) 978
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 159731
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O 24437
15.3%
Y 23932
15.0%
E 22146
13.9%
T 21998
13.8%
P 21437
13.4%
S 17702
11.1%
I 15176
9.5%
L 4924
 
3.1%
H 4263
 
2.7%
N 2738
 
1.7%
Other values (5) 978
 
0.6%
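
The summary figures for `typeStatus` can be checked by hand. The sketch below assumes that Distinct (%) and mean length are taken over non-missing values; that assumption is consistent with the numbers reported here but is not stated by the report itself. Variable names are illustrative.

```python
# Figures copied from the report: dataset overview and the typeStatus summary.
n_rows = 988402
n_missing = 967033
n_distinct = 13
total_chars = 159731

n_present = n_rows - n_missing                # non-missing values
missing_pct = 100 * n_missing / n_rows
# Assumption: the denominator is the non-missing count, not the row count.
distinct_pct = 100 * n_distinct / n_present
mean_length = total_chars / n_present

print(n_present)
print(round(missing_pct, 1))
print(round(distinct_pct, 1))
print(round(mean_length, 6))
```

Under that assumption the computed values agree with the reported Missing (%), Distinct (%) and Mean length for this variable.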

identifiedBy
Text

Missing 

Distinct 4879
Distinct (%) 4.0%
Missing 866335
Missing (%) 87.7%
Memory size 7.5 MiB

Length

Max length 131
Median length 108
Mean length 37.6902275
Min length 3

Characters and Unicode

Total characters 4600733
Distinct characters 91
Distinct categories 8
Distinct scripts 2
Distinct blocks 2

Unique

Unique 1781
Unique (%) 1.5%

Sample

1st row Blair, S. M.
2nd row Acevedo-Rodríguez, P., (BOT), Smithsonian Institution - National Museum of Natural History (UNITED STATES)
3rd row Acevedo-Rodríguez, P., (BOT), Smithsonian Institution - National Museum of Natural History (UNITED STATES)
4th row Wagner, W. L., (BOT), Smithsonian Institution - National Museum of Natural History (UNITED STATES)
5th row Wagner, W. L., (BOT), Smithsonian Institution - National Museum of Natural History (UNITED STATES)

Value                Count   Frequency (%)
united               29831   4.2%
states               29821   4.2%
of                   27693   3.9%
(empty)              27094   3.8%
national             26421   3.7%
museum               26206   3.6%
smithsonian          26080   3.6%
natural              26024   3.6%
history              26005   3.6%
institution          26002   3.6%
Other values (4341)  447270  62.3%

Most occurring characters

ValueCountFrequency (%)
596380
 
13.0%
t 259398
 
5.6%
a 250409
 
5.4%
o 245191
 
5.3%
i 229198
 
5.0%
n 225319
 
4.9%
, 198332
 
4.3%
. 187693
 
4.1%
r 186814
 
4.1%
e 183391
 
4.0%
Other values (81) 2038608
44.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2399509
52.2%
Uppercase Letter 1037834
22.6%
Space Separator 596380
 
13.0%
Other Punctuation 392196
 
8.5%
Open Punctuation 70430
 
1.5%
Close Punctuation 70430
 
1.5%
Dash Punctuation 33950
 
0.7%
Decimal Number 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 259398
10.8%
a 250409
10.4%
o 245191
10.2%
i 229198
9.6%
n 225319
9.4%
r 186814
7.8%
e 183391
7.6%
u 150111
 
6.3%
s 150086
 
6.3%
l 109220
 
4.6%
Other values (33) 410372
17.1%
Uppercase Letter
ValueCountFrequency (%)
S 134600
13.0%
T 126793
12.2%
N 98928
 
9.5%
E 85767
 
8.3%
I 62658
 
6.0%
A 62027
 
6.0%
M 58662
 
5.7%
D 56356
 
5.4%
U 48305
 
4.7%
H 46164
 
4.4%
Other values (20) 257574
24.8%
Other Punctuation
ValueCountFrequency (%)
, 198332
50.6%
. 187693
47.9%
; 5701
 
1.5%
" 272
 
0.1%
' 140
 
< 0.1%
& 42
 
< 0.1%
¡ 12
 
< 0.1%
? 3
 
< 0.1%
/ 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
9 2
50.0%
1 1
25.0%
2 1
25.0%
Open Punctuation
ValueCountFrequency (%)
( 69931
99.3%
[ 499
 
0.7%
Close Punctuation
ValueCountFrequency (%)
) 69931
99.3%
] 499
 
0.7%
Space Separator
ValueCountFrequency (%)
596380
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 33950
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3437343
74.7%
Common 1163390
 
25.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 259398
 
7.5%
a 250409
 
7.3%
o 245191
 
7.1%
i 229198
 
6.7%
n 225319
 
6.6%
r 186814
 
5.4%
e 183391
 
5.3%
u 150111
 
4.4%
s 150086
 
4.4%
S 134600
 
3.9%
Other values (63) 1422826
41.4%
Common
ValueCountFrequency (%)
596380
51.3%
, 198332
 
17.0%
. 187693
 
16.1%
( 69931
 
6.0%
) 69931
 
6.0%
- 33950
 
2.9%
; 5701
 
0.5%
[ 499
 
< 0.1%
] 499
 
< 0.1%
" 272
 
< 0.1%
Other values (8) 202
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4593697
99.8%
None 7036
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
596380
 
13.0%
t 259398
 
5.6%
a 250409
 
5.5%
o 245191
 
5.3%
i 229198
 
5.0%
n 225319
 
4.9%
, 198332
 
4.3%
. 187693
 
4.1%
r 186814
 
4.1%
e 183391
 
4.0%
Other values (59) 2031572
44.2%
None
ValueCountFrequency (%)
í 3953
56.2%
á 810
 
11.5%
é 712
 
10.1%
ñ 329
 
4.7%
ö 311
 
4.4%
ü 236
 
3.4%
ó 220
 
3.1%
ä 193
 
2.7%
ã 71
 
1.0%
ú 64
 
0.9%
Other values (12) 137
 
1.9%
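
The word counts listed for `identifiedBy` appear to be computed on lower-cased, whitespace-separated tokens with surrounding punctuation removed (for example `(UNITED STATES)` is counted as `united` and `states`). That reading is inferred from the tables, not taken from the profiler's documentation. A rough sketch under that assumption, with a hypothetical helper `word_counts`:

```python
import string
from collections import Counter

def word_counts(values):
    """Count lower-cased tokens after stripping leading/trailing punctuation."""
    counts = Counter()
    for value in values:
        for raw in value.lower().split():
            token = raw.strip(string.punctuation)
            if token:  # design choice: skip tokens that were only punctuation
                counts[token] += 1
    return counts

# Two of the sample rows shown for this variable.
rows = [
    "Blair, S. M.",
    "Wagner, W. L., (BOT), Smithsonian Institution - National Museum of Natural History (UNITED STATES)",
]
counts = word_counts(rows)
print(counts["united"], counts["states"], counts["smithsonian"])
```

The sketch drops tokens that consist only of punctuation, such as the standalone `-`. The report's table contains a row with no visible value, which may correspond to such tokens, but that is a guess.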

dateIdentified
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 988401
Missing (%) > 99.9%
Memory size 7.5 MiB

Length

Max length 59
Median length 59
Mean length 59
Min length 59

Characters and Unicode

Total characters 59
Distinct characters 20
Distinct categories 4
Distinct scripts 2
Distinct blocks 1

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Plantae, Dicotyledonae, Malpighiales, Violaceae, Violoideae

Value          Count  Frequency (%)
plantae        1      20.0%
dicotyledonae  1      20.0%
malpighiales   1      20.0%
violaceae      1      20.0%
violoideae     1      20.0%

Most occurring characters

ValueCountFrequency (%)
a 8
13.6%
e 8
13.6%
i 6
10.2%
l 6
10.2%
o 5
8.5%
, 4
 
6.8%
4
 
6.8%
c 2
 
3.4%
d 2
 
3.4%
V 2
 
3.4%
Other values (10) 12
20.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 46
78.0%
Uppercase Letter 5
 
8.5%
Other Punctuation 4
 
6.8%
Space Separator 4
 
6.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8
17.4%
e 8
17.4%
i 6
13.0%
l 6
13.0%
o 5
10.9%
c 2
 
4.3%
d 2
 
4.3%
t 2
 
4.3%
n 2
 
4.3%
y 1
 
2.2%
Other values (4) 4
8.7%
Uppercase Letter
ValueCountFrequency (%)
V 2
40.0%
D 1
20.0%
M 1
20.0%
P 1
20.0%
Other Punctuation
ValueCountFrequency (%)
, 4
100.0%
Space Separator
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 51
86.4%
Common 8
 
13.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8
15.7%
e 8
15.7%
i 6
11.8%
l 6
11.8%
o 5
9.8%
c 2
 
3.9%
d 2
 
3.9%
V 2
 
3.9%
t 2
 
3.9%
n 2
 
3.9%
Other values (8) 8
15.7%
Common
ValueCountFrequency (%)
, 4
50.0%
4
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 59
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 8
13.6%
e 8
13.6%
i 6
10.2%
l 6
10.2%
o 5
8.5%
, 4
 
6.8%
4
 
6.8%
c 2
 
3.4%
d 2
 
3.4%
V 2
 
3.4%
Other values (10) 12
20.3%
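
The only non-missing `dateIdentified` value is a taxonomic classification string rather than a date, which may point to a misaligned row in the source file; that interpretation is an assumption, not something the report states. A minimal sketch of a plausibility check follows, assuming `YYYY-MM-DD` dates are the expected format and using a hypothetical helper `looks_like_iso_date`.

```python
from datetime import date

def looks_like_iso_date(value):
    """Return True if `date.fromisoformat` accepts `value`."""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False

values = [
    # The single non-missing value reported for dateIdentified.
    "Plantae, Dicotyledonae, Malpighiales, Violaceae, Violoideae",
    # An illustrative well-formed date for comparison.
    "1938-11-11",
]
suspect = [v for v in values if not looks_like_iso_date(v)]
print(suspect)
```

A check of this kind only flags values for review; it does not establish why a value ended up in the column.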

identificationReferences
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 988401
Missing (%) > 99.9%
Memory size 7.5 MiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 7
Distinct characters 6
Distinct categories 2
Distinct scripts 1
Distinct blocks 1

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Plantae

Value    Count  Frequency (%)
plantae  1      100.0%

Most occurring characters

ValueCountFrequency (%)
a 2
28.6%
P 1
14.3%
l 1
14.3%
n 1
14.3%
t 1
14.3%
e 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6
85.7%
Uppercase Letter 1
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2
33.3%
l 1
16.7%
n 1
16.7%
t 1
16.7%
e 1
16.7%
Uppercase Letter
ValueCountFrequency (%)
P 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2
28.6%
P 1
14.3%
l 1
14.3%
n 1
14.3%
t 1
14.3%
e 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2
28.6%
P 1
14.3%
l 1
14.3%
n 1
14.3%
t 1
14.3%
e 1
14.3%

identificationVerificationStatus
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 988401
Missing (%) > 99.9%
Memory size 7.5 MiB

Length

Max length 12
Median length 12
Mean length 12
Min length 12

Characters and Unicode

Total characters 12
Distinct characters 10
Distinct categories 2
Distinct scripts 1
Distinct blocks 1

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Tracheophyta

Value         Count  Frequency (%)
tracheophyta  1      100.0%

Most occurring characters

ValueCountFrequency (%)
a 2
16.7%
h 2
16.7%
T 1
8.3%
r 1
8.3%
c 1
8.3%
e 1
8.3%
o 1
8.3%
p 1
8.3%
y 1
8.3%
t 1
8.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11
91.7%
Uppercase Letter 1
 
8.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2
18.2%
h 2
18.2%
r 1
9.1%
c 1
9.1%
e 1
9.1%
o 1
9.1%
p 1
9.1%
y 1
9.1%
t 1
9.1%
Uppercase Letter
ValueCountFrequency (%)
T 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2
16.7%
h 2
16.7%
T 1
8.3%
r 1
8.3%
c 1
8.3%
e 1
8.3%
o 1
8.3%
p 1
8.3%
y 1
8.3%
t 1
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2
16.7%
h 2
16.7%
T 1
8.3%
r 1
8.3%
c 1
8.3%
e 1
8.3%
o 1
8.3%
p 1
8.3%
y 1
8.3%
t 1
8.3%

identificationRemarks
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 988401
Missing (%) > 99.9%
Memory size 7.5 MiB

Length

Max length 13
Median length 13
Mean length 13
Min length 13

Characters and Unicode

Total characters 13
Distinct characters 10
Distinct categories 2
Distinct scripts 1
Distinct blocks 1

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Magnoliopsida

Value          Count  Frequency (%)
magnoliopsida  1      100.0%

Most occurring characters

ValueCountFrequency (%)
a 2
15.4%
o 2
15.4%
i 2
15.4%
M 1
7.7%
g 1
7.7%
n 1
7.7%
l 1
7.7%
p 1
7.7%
s 1
7.7%
d 1
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12
92.3%
Uppercase Letter 1
 
7.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2
16.7%
o 2
16.7%
i 2
16.7%
g 1
8.3%
n 1
8.3%
l 1
8.3%
p 1
8.3%
s 1
8.3%
d 1
8.3%
Uppercase Letter
ValueCountFrequency (%)
M 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 13
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2
15.4%
o 2
15.4%
i 2
15.4%
M 1
7.7%
g 1
7.7%
n 1
7.7%
l 1
7.7%
p 1
7.7%
s 1
7.7%
d 1
7.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2
15.4%
o 2
15.4%
i 2
15.4%
M 1
7.7%
g 1
7.7%
n 1
7.7%
l 1
7.7%
p 1
7.7%
s 1
7.7%
d 1
7.7%

taxonID
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 988401
Missing (%) > 99.9%
Memory size 7.5 MiB

Length

Max length 12
Median length 12
Mean length 12
Min length 12

Characters and Unicode

Total characters 12
Distinct characters 9
Distinct categories 2
Distinct scripts 1
Distinct blocks 1

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Malpighiales

Value         Count  Frequency (%)
malpighiales  1      100.0%

Most occurring characters

ValueCountFrequency (%)
a 2
16.7%
l 2
16.7%
i 2
16.7%
M 1
8.3%
p 1
8.3%
g 1
8.3%
h 1
8.3%
e 1
8.3%
s 1
8.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11
91.7%
Uppercase Letter 1
 
8.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2
18.2%
l 2
18.2%
i 2
18.2%
p 1
9.1%
g 1
9.1%
h 1
9.1%
e 1
9.1%
s 1
9.1%
Uppercase Letter
ValueCountFrequency (%)
M 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2
16.7%
l 2
16.7%
i 2
16.7%
M 1
8.3%
p 1
8.3%
g 1
8.3%
h 1
8.3%
e 1
8.3%
s 1
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2
16.7%
l 2
16.7%
i 2
16.7%
M 1
8.3%
p 1
8.3%
g 1
8.3%
h 1
8.3%
e 1
8.3%
s 1
8.3%

Distinct 141149
Distinct (%) 14.3%
Missing 3368
Missing (%) 0.3%
Memory size 7.5 MiB

Length

Max length 9
Median length 7
Mean length 7.000691347
Min length 1

Characters and Unicode

Total characters 6895919
Distinct characters 17
Distinct categories 3
Distinct scripts 2
Distinct blocks 1

Unique

Unique 52485
Unique (%) 5.3%

Sample

1st row 2654944
2nd row 2947270
3rd row 10416230
4th row 3687053
5th row 7355530

Value                  Count   Frequency (%)
7947184                4001    0.4%
2655370                1415    0.1%
6                      1163    0.1%
3219107                1082    0.1%
5426909                1064    0.1%
2702678                1008    0.1%
5426949                994     0.1%
2654909                868     0.1%
2655497                809     0.1%
5426932                760     0.1%
Other values (141139)  971870  98.7%

Most occurring characters

ValueCountFrequency (%)
2 898604
13.0%
3 797962
11.6%
7 733166
10.6%
5 715553
10.4%
0 648866
9.4%
1 638677
9.3%
8 635333
9.2%
6 626367
9.1%
9 613562
8.9%
4 587820
8.5%
Other values (7) 9
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6895910
> 99.9%
Lowercase Letter 8
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 898604
13.0%
3 797962
11.6%
7 733166
10.6%
5 715553
10.4%
0 648866
9.4%
1 638677
9.3%
8 635333
9.2%
6 626367
9.1%
9 613562
8.9%
4 587820
8.5%
Lowercase Letter
ValueCountFrequency (%)
a 2
25.0%
e 2
25.0%
i 1
12.5%
o 1
12.5%
l 1
12.5%
c 1
12.5%
Uppercase Letter
ValueCountFrequency (%)
V 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6895910
> 99.9%
Latin 9
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
2 898604
13.0%
3 797962
11.6%
7 733166
10.6%
5 715553
10.4%
0 648866
9.4%
1 638677
9.3%
8 635333
9.2%
6 626367
9.1%
9 613562
8.9%
4 587820
8.5%
Latin
ValueCountFrequency (%)
a 2
22.2%
e 2
22.2%
V 1
11.1%
i 1
11.1%
o 1
11.1%
l 1
11.1%
c 1
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6895919
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 898604
13.0%
3 797962
11.6%
7 733166
10.6%
5 715553
10.4%
0 648866
9.4%
1 638677
9.3%
8 635333
9.2%
6 626367
9.1%
9 613562
8.9%
4 587820
8.5%
Other values (7) 9
 
< 0.1%

namePublishedInID
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 988401
Missing (%) > 99.9%
Memory size 7.5 MiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 7
Distinct characters 7
Distinct categories 2
Distinct scripts 1
Distinct blocks 1

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Rinorea

Value    Count  Frequency (%)
rinorea  1      100.0%

Most occurring characters

ValueCountFrequency (%)
R 1
14.3%
i 1
14.3%
n 1
14.3%
o 1
14.3%
r 1
14.3%
e 1
14.3%
a 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6
85.7%
Uppercase Letter 1
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1
16.7%
n 1
16.7%
o 1
16.7%
r 1
16.7%
e 1
16.7%
a 1
16.7%
Uppercase Letter
ValueCountFrequency (%)
R 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 1
14.3%
i 1
14.3%
n 1
14.3%
o 1
14.3%
r 1
14.3%
e 1
14.3%
a 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 1
14.3%
i 1
14.3%
n 1
14.3%
o 1
14.3%
r 1
14.3%
e 1
14.3%
a 1
14.3%

taxonConceptID
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 988401
Missing (%) > 99.9%
Memory size 7.5 MiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 7
Distinct characters 7
Distinct categories 2
Distinct scripts 1
Distinct blocks 1

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Rinorea

Value    Count  Frequency (%)
rinorea  1      100.0%

Most occurring characters

ValueCountFrequency (%)
R 1
14.3%
i 1
14.3%
n 1
14.3%
o 1
14.3%
r 1
14.3%
e 1
14.3%
a 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6
85.7%
Uppercase Letter 1
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1
16.7%
n 1
16.7%
o 1
16.7%
r 1
16.7%
e 1
16.7%
a 1
16.7%
Uppercase Letter
ValueCountFrequency (%)
R 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 1
14.3%
i 1
14.3%
n 1
14.3%
o 1
14.3%
r 1
14.3%
e 1
14.3%
a 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 1
14.3%
i 1
14.3%
n 1
14.3%
o 1
14.3%
r 1
14.3%
e 1
14.3%
a 1
14.3%

Distinct 171484
Distinct (%) 17.3%
Missing 3
Missing (%) < 0.1%
Memory size 7.5 MiB

Length

Max length 145
Median length 90
Mean length 31.1400224
Min length 5

Characters and Unicode

Total characters 30778767
Distinct characters 118
Distinct categories 10
Distinct scripts 2
Distinct blocks 2

Unique

Unique 76155
Unique (%) 7.7%

Sample

1st row Lithothamnion calcareum (Pallas) Areschoug
2nd row Amicia glandulosa Kunth
3rd row Tripogandra glandulosa (Seub.) Rohweder
4th row Connarus steyermarkii Prance
5th row Trichoneura grandiglumis (Nees) Ekman

Value                 Count    Frequency (%)
l                     155063   4.1%
(empty)               123403   3.2%
ex                    71536    1.9%
var                   42961    1.1%
kunth                 25715    0.7%
dc                    25369    0.7%
benth                 22482    0.6%
a.gray                22453    0.6%
subsp                 20360    0.5%
sw                    19134    0.5%
Other values (72296)  3270675  86.1%

Most occurring characters

ValueCountFrequency (%)
2810752
 
9.1%
a 2784580
 
9.0%
i 2163680
 
7.0%
e 1918469
 
6.2%
r 1705055
 
5.5%
l 1505257
 
4.9%
o 1502652
 
4.9%
n 1425654
 
4.6%
s 1389295
 
4.5%
. 1385249
 
4.5%
Other values (108) 12188124
39.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 22543815
73.2%
Uppercase Letter 2972512
 
9.7%
Space Separator 2810752
 
9.1%
Other Punctuation 1548996
 
5.0%
Open Punctuation 381562
 
1.2%
Close Punctuation 381562
 
1.2%
Decimal Number 125780
 
0.4%
Dash Punctuation 11127
 
< 0.1%
Math Symbol 2640
 
< 0.1%
Connector Punctuation 21
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2784580
12.4%
i 2163680
 
9.6%
e 1918469
 
8.5%
r 1705055
 
7.6%
l 1505257
 
6.7%
o 1502652
 
6.7%
n 1425654
 
6.3%
s 1389295
 
6.2%
u 1233185
 
5.5%
t 1179720
 
5.2%
Other values (51) 5736268
25.4%
Uppercase Letter
ValueCountFrequency (%)
L 300531
 
10.1%
S 279166
 
9.4%
C 270055
 
9.1%
P 213524
 
7.2%
A 207708
 
7.0%
M 200072
 
6.7%
B 195748
 
6.6%
H 174615
 
5.9%
R 140797
 
4.7%
D 140382
 
4.7%
Other values (27) 849914
28.6%
Decimal Number
ValueCountFrequency (%)
1 36281
28.8%
8 25539
20.3%
9 16579
13.2%
2 7723
 
6.1%
3 7598
 
6.0%
7 7534
 
6.0%
0 7406
 
5.9%
4 6583
 
5.2%
6 5657
 
4.5%
5 4880
 
3.9%
Other Punctuation
ValueCountFrequency (%)
. 1385249
89.4%
& 123403
 
8.0%
, 38852
 
2.5%
' 1492
 
0.1%
Space Separator
ValueCountFrequency (%)
2810752
100.0%
Open Punctuation
ValueCountFrequency (%)
( 381562
100.0%
Close Punctuation
ValueCountFrequency (%)
) 381562
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 11127
100.0%
Math Symbol
ValueCountFrequency (%)
× 2640
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 21
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 25516327
82.9%
Common 5262440
 
17.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2784580
 
10.9%
i 2163680
 
8.5%
e 1918469
 
7.5%
r 1705055
 
6.7%
l 1505257
 
5.9%
o 1502652
 
5.9%
n 1425654
 
5.6%
s 1389295
 
5.4%
u 1233185
 
4.8%
t 1179720
 
4.6%
Other values (88) 8708780
34.1%
Common
ValueCountFrequency (%)
2810752
53.4%
. 1385249
26.3%
( 381562
 
7.3%
) 381562
 
7.3%
& 123403
 
2.3%
, 38852
 
0.7%
1 36281
 
0.7%
8 25539
 
0.5%
9 16579
 
0.3%
- 11127
 
0.2%
Other values (10) 51534
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30731098
99.8%
None 47669
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2810752
 
9.1%
a 2784580
 
9.1%
i 2163680
 
7.0%
e 1918469
 
6.2%
r 1705055
 
5.5%
l 1505257
 
4.9%
o 1502652
 
4.9%
n 1425654
 
4.6%
s 1389295
 
4.5%
. 1385249
 
4.5%
Other values (61) 12140455
39.5%
None
ValueCountFrequency (%)
ü 15283
32.1%
é 9353
19.6%
ö 6844
14.4%
× 2640
 
5.5%
ä 2495
 
5.2%
á 2384
 
5.0%
Á 1742
 
3.7%
ø 1175
 
2.5%
è 888
 
1.9%
ó 870
 
1.8%
Other values (37) 3995
 
8.4%
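
Each frequency table in this report lists at most ten values and folds the remainder into an `Other values (n)` row. A small sketch of that aggregation follows, assuming percentages are taken over the total count; `top_k_table` is a hypothetical helper and the input counts are made up for illustration, not taken from the dataset.

```python
from collections import Counter

def top_k_table(counts, k=10):
    """Return (value, count, percent) rows for the k most common values,
    plus one aggregated row for everything else."""
    total = sum(counts.values())
    ranked = counts.most_common()
    rows = [(value, n, 100 * n / total) for value, n in ranked[:k]]
    rest = ranked[k:]
    if rest:
        rest_n = sum(n for _, n in rest)
        rows.append((f"Other values ({len(rest)})", rest_n, 100 * rest_n / total))
    return rows

# Illustrative counts only.
counts = Counter({"a": 5, "b": 3, "c": 1, "d": 1})
table = top_k_table(counts, k=2)
for value, n, pct in table:
    print(f"{value} {n} {pct:.1f}%")
```

Whether the profiler breaks ties or rounds in exactly this way is not confirmed; the sketch only illustrates the shape of the table.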

parentNameUsage
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 988401
Missing (%) > 99.9%
Memory size 7.5 MiB

Length

Max length 9
Median length 9
Mean length 9
Min length 9

Characters and Unicode

Total characters 9
Distinct characters 9
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row pubiflora

Value      Count  Frequency (%)
pubiflora  1      100.0%

Most occurring characters

ValueCountFrequency (%)
p 1
11.1%
u 1
11.1%
b 1
11.1%
i 1
11.1%
f 1
11.1%
l 1
11.1%
o 1
11.1%
r 1
11.1%
a 1
11.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
p 1
11.1%
u 1
11.1%
b 1
11.1%
i 1
11.1%
f 1
11.1%
l 1
11.1%
o 1
11.1%
r 1
11.1%
a 1
11.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 9
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
p 1
11.1%
u 1
11.1%
b 1
11.1%
i 1
11.1%
f 1
11.1%
l 1
11.1%
o 1
11.1%
r 1
11.1%
a 1
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
p 1
11.1%
u 1
11.1%
b 1
11.1%
i 1
11.1%
f 1
11.1%
l 1
11.1%
o 1
11.1%
r 1
11.1%
a 1
11.1%

originalNameUsage
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 988401
Missing (%) > 99.9%
Memory size 7.5 MiB

Length

Max length 9
Median length 9
Mean length 9
Min length 9

Characters and Unicode

Total characters 9
Distinct characters 9
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row pubiflora

Value      Count  Frequency (%)
pubiflora  1      100.0%

Most occurring characters

ValueCountFrequency (%)
p 1
11.1%
u 1
11.1%
b 1
11.1%
i 1
11.1%
f 1
11.1%
l 1
11.1%
o 1
11.1%
r 1
11.1%
a 1
11.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
p 1
11.1%
u 1
11.1%
b 1
11.1%
i 1
11.1%
f 1
11.1%
l 1
11.1%
o 1
11.1%
r 1
11.1%
a 1
11.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 9
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
p 1
11.1%
u 1
11.1%
b 1
11.1%
i 1
11.1%
f 1
11.1%
l 1
11.1%
o 1
11.1%
r 1
11.1%
a 1
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
p 1
11.1%
u 1
11.1%
b 1
11.1%
i 1
11.1%
f 1
11.1%
l 1
11.1%
o 1
11.1%
r 1
11.1%
a 1
11.1%

namePublishedIn
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 988401
Missing (%) > 99.9%
Memory size 7.5 MiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 7
Distinct characters 7
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row VARIETY

Value    Count  Frequency (%)
variety  1      100.0%

Most occurring characters

ValueCountFrequency (%)
V 1
14.3%
A 1
14.3%
R 1
14.3%
I 1
14.3%
E 1
14.3%
T 1
14.3%
Y 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 7
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
V 1
14.3%
A 1
14.3%
R 1
14.3%
I 1
14.3%
E 1
14.3%
T 1
14.3%
Y 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 7
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
V 1
14.3%
A 1
14.3%
R 1
14.3%
I 1
14.3%
E 1
14.3%
T 1
14.3%
Y 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
V 1
14.3%
A 1
14.3%
R 1
14.3%
I 1
14.3%
E 1
14.3%
T 1
14.3%
Y 1
14.3%

Distinct 1871
Distinct (%) 0.2%
Missing 3060
Missing (%) 0.3%
Memory size 7.5 MiB

Length

Max length 106
Median length 83
Mean length 55.78930767
Min length 6

Characters and Unicode

Total characters 54971548
Distinct characters 60
Distinct categories 7
Distinct scripts 2
Distinct blocks 2

Unique

Unique 247
Unique (%) < 0.1%

Sample

1st row Plantae, Rhodophyta, Corallinales, Lithothamniaceae
2nd row Plantae, Dicotyledonae, Fabales, Fabaceae, Papilionoideae
3rd row Plantae, Monocotyledonae, Commelinales, Commelinaceae
4th row Plantae, Dicotyledonae, Oxalidales, Connaraceae
5th row Plantae, Monocotyledonae, Poales, Poaceae, Chloridoideae

Value                Count    Frequency (%)
plantae              906960   19.6%
dicotyledonae        565444   12.2%
monocotyledonae      198988   4.3%
poales               153711   3.3%
poaceae              110119   2.4%
asterales            83265    1.8%
asteraceae           78409    1.7%
asteroideae          62020    1.3%
pteridophyte         60609    1.3%
lamiales             58285    1.3%
Other values (1989)  2357982  50.9%

Most occurring characters

ValueCountFrequency (%)
a 7742674
14.1%
e 7698551
14.0%
o 4128767
 
7.5%
3650450
 
6.6%
, 3626340
 
6.6%
l 3562838
 
6.5%
n 2796731
 
5.1%
t 2746790
 
5.0%
i 2727025
 
5.0%
c 2484340
 
4.5%
Other values (50) 13807042
25.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 43041381
78.3%
Uppercase Letter 4611679
 
8.4%
Space Separator 3650450
 
6.6%
Other Punctuation 3630777
 
6.6%
Close Punctuation 18607
 
< 0.1%
Open Punctuation 18607
 
< 0.1%
Dash Punctuation 47
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 7742674
18.0%
e 7698551
17.9%
o 4128767
9.6%
l 3562838
8.3%
n 2796731
 
6.5%
t 2746790
 
6.4%
i 2727025
 
6.3%
c 2484340
 
5.8%
s 1741258
 
4.0%
d 1740291
 
4.0%
Other values (17) 5672116
13.2%
Uppercase Letter
ValueCountFrequency (%)
P 1556974
33.8%
D 610750
 
13.2%
A 433977
 
9.4%
M 401816
 
8.7%
C 333336
 
7.2%
L 197314
 
4.3%
F 191584
 
4.2%
R 167750
 
3.6%
B 159640
 
3.5%
S 141071
 
3.1%
Other values (16) 417467
 
9.1%
Other Punctuation
ValueCountFrequency (%)
, 3626340
99.9%
. 4436
 
0.1%
? 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
3650450
100.0%
Close Punctuation
ValueCountFrequency (%)
) 18607
100.0%
Open Punctuation
ValueCountFrequency (%)
( 18607
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 47
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 47653060
86.7%
Common 7318488
 
13.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 7742674
16.2%
e 7698551
16.2%
o 4128767
8.7%
l 3562838
 
7.5%
n 2796731
 
5.9%
t 2746790
 
5.8%
i 2727025
 
5.7%
c 2484340
 
5.2%
s 1741258
 
3.7%
d 1740291
 
3.7%
Other values (43) 10283795
21.6%
Common
ValueCountFrequency (%)
3650450
49.9%
, 3626340
49.6%
) 18607
 
0.3%
( 18607
 
0.3%
. 4436
 
0.1%
- 47
 
< 0.1%
? 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 54971425
> 99.9%
None 123
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 7742674
14.1%
e 7698551
14.0%
o 4128767
 
7.5%
3650450
 
6.6%
, 3626340
 
6.6%
l 3562838
 
6.5%
n 2796731
 
5.1%
t 2746790
 
5.0%
i 2727025
 
5.0%
c 2484340
 
4.5%
Other values (49) 13806919
25.1%
None
ValueCountFrequency (%)
ö 123
100.0%

kingdom
Text

Distinct: 7
Distinct (%): < 0.1%
Missing: 3
Missing (%): < 0.1%
Memory size: 7.5 MiB

Length

Max length: 14
Median length: 7
Mean length: 6.971155373
Min length: 5

Characters and Unicode

Total characters: 6890283
Distinct characters: 22
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Plantae
2nd row: Plantae
3rd row: Plantae
4th row: Plantae
5th row: Plantae
ValueCountFrequency (%)
plantae 907311
91.5%
fungi 48945
 
4.9%
chromista 17041
 
1.7%
bacteria 11701
 
1.2%
incertae 3366
 
0.3%
sedis 3366
 
0.3%
protozoa 31
 
< 0.1%
animalia 4
 
< 0.1%
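
The table above lists `incertae` and `sedis` as separate entries with the same count (3366), and every entry is lowercase, which suggests the values were lowercased and split into whitespace-separated tokens before counting. The sketch below reproduces that behaviour under that assumption; `token_counts` is a hypothetical helper, not a function of the profiling tool.

```python
import string
from collections import Counter


def token_counts(values):
    """Lowercase each value, split on whitespace, strip punctuation, count tokens."""
    counts = Counter()
    for value in values:
        for raw in value.lower().split():
            token = raw.strip(string.punctuation)
            if token:  # skip tokens that were punctuation only
                counts[token] += 1
    return counts


# A two-word value contributes one count to each of its tokens.
print(token_counts(["Plantae", "incertae sedis", "Plantae"]))
```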

Most occurring characters

ValueCountFrequency (%)
a 1858470
27.0%
n 959626
13.9%
t 939450
13.6%
e 929110
13.5%
P 907342
13.2%
l 907315
13.2%
i 84427
 
1.2%
F 48945
 
0.7%
u 48945
 
0.7%
g 48945
 
0.7%
Other values (12) 157708
 
2.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5901884
85.7%
Uppercase Letter 985033
 
14.3%
Space Separator 3366
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1858470
31.5%
n 959626
16.3%
t 939450
15.9%
e 929110
15.7%
l 907315
15.4%
i 84427
 
1.4%
u 48945
 
0.8%
g 48945
 
0.8%
r 32139
 
0.5%
s 23773
 
0.4%
Other values (6) 69684
 
1.2%
Uppercase Letter
ValueCountFrequency (%)
P 907342
92.1%
F 48945
 
5.0%
C 17041
 
1.7%
B 11701
 
1.2%
A 4
 
< 0.1%
Space Separator
ValueCountFrequency (%)
3366
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6886917
> 99.9%
Common 3366
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1858470
27.0%
n 959626
13.9%
t 939450
13.6%
e 929110
13.5%
P 907342
13.2%
l 907315
13.2%
i 84427
 
1.2%
F 48945
 
0.7%
u 48945
 
0.7%
g 48945
 
0.7%
Other values (11) 154342
 
2.2%
Common
ValueCountFrequency (%)
3366
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6890283
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1858470
27.0%
n 959626
13.9%
t 939450
13.6%
e 929110
13.5%
P 907342
13.2%
l 907315
13.2%
i 84427
 
1.2%
F 48945
 
0.7%
u 48945
 
0.7%
g 48945
 
0.7%
Other values (12) 157708
 
2.3%
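
The length statistics and the total character count reported for the kingdom-level values above can be cross-checked from the value counts alone. The sketch below is a rough consistency check, not the report's own computation. It assumes that the tokens `incertae` and `sedis` belong to a single two-word value, and it uses capitalised spellings inferred from the sample rows and the uppercase-letter table; `length_stats` is a hypothetical helper.

```python
# Value counts copied from the frequency table above.
# "incertae sedis" is reassembled from two tokens (an assumption).
KINGDOM_COUNTS = {
    "Plantae": 907311,
    "Fungi": 48945,
    "Chromista": 17041,
    "Bacteria": 11701,
    "incertae sedis": 3366,
    "Protozoa": 31,
    "Animalia": 4,
}


def length_stats(value_counts):
    """Derive string-length statistics from a {value: count} mapping."""
    by_length = {}
    for value, count in value_counts.items():
        by_length[len(value)] = by_length.get(len(value), 0) + count

    n = sum(by_length.values())
    total_chars = sum(length * count for length, count in by_length.items())

    def length_at(index):
        # Length of the index-th value (0-based) when values are sorted by length.
        seen = 0
        for length in sorted(by_length):
            seen += by_length[length]
            if index < seen:
                return length
        raise IndexError(index)

    median = (length_at((n - 1) // 2) + length_at(n // 2)) / 2
    return {
        "n": n,
        "total_characters": total_chars,
        "max": max(by_length),
        "median": median,
        "mean": total_chars / n,
        "min": min(by_length),
    }


stats = length_stats(KINGDOM_COUNTS)
print(stats)
```

Under these assumptions the totals agree with the report: 6890283 characters over 988399 non-missing values, with lengths ranging from 5 to 14 and a median of 7.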

phylum
Text

Distinct: 24
Distinct (%): < 0.1%
Missing: 4754
Missing (%): 0.5%
Memory size: 7.5 MiB

Length

Max length: 16
Median length: 12
Mean length: 11.72722051
Min length: 7

Characters and Unicode

Total characters: 11535457
Distinct characters: 32
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): < 0.1%

Sample

1st row: Rhodophyta
2nd row: Tracheophyta
3rd row: Tracheophyta
4th row: Tracheophyta
5th row: Tracheophyta
ValueCountFrequency (%)
tracheophyta 830617
84.4%
ascomycota 48276
 
4.9%
bryophyta 32695
 
3.3%
rhodophyta 26385
 
2.7%
ochrophyta 15149
 
1.5%
cyanobacteria 11694
 
1.2%
chlorophyta 9268
 
0.9%
marchantiophyta 5937
 
0.6%
myzozoa 1887
 
0.2%
charophyta 1126
 
0.1%
Other values (14) 614
 
0.1%

Most occurring characters

ValueCountFrequency (%)
a 1851109
16.0%
h 1809899
15.7%
o 1070187
9.3%
y 1016309
8.8%
t 987912
8.6%
c 960529
8.3%
p 921305
8.0%
r 906622
7.9%
e 842467
7.3%
T 830618
7.2%
Other values (22) 338500
 
2.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10551802
91.5%
Uppercase Letter 983655
 
8.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1851109
17.5%
h 1809899
17.2%
o 1070187
10.1%
y 1016309
9.6%
t 987912
9.4%
c 960529
9.1%
p 921305
8.7%
r 906622
8.6%
e 842467
8.0%
m 48739
 
0.5%
Other values (10) 136724
 
1.3%
Uppercase Letter
ValueCountFrequency (%)
T 830618
84.4%
A 48406
 
4.9%
B 33146
 
3.4%
R 26385
 
2.7%
C 22097
 
2.2%
O 15149
 
1.5%
M 7826
 
0.8%
E 19
 
< 0.1%
P 5
 
< 0.1%
F 2
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 11535457
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1851109
16.0%
h 1809899
15.7%
o 1070187
9.3%
y 1016309
8.8%
t 987912
8.6%
c 960529
8.3%
p 921305
8.0%
r 906622
7.9%
e 842467
7.3%
T 830618
7.2%
Other values (22) 338500
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11535457
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1851109
16.0%
h 1809899
15.7%
o 1070187
9.3%
y 1016309
8.8%
t 987912
8.6%
c 960529
8.3%
p 921305
8.0%
r 906622
7.9%
e 842467
7.3%
T 830618
7.2%
Other values (22) 338500
 
2.9%

class
Text

Distinct: 68
Distinct (%): < 0.1%
Missing: 5481
Missing (%): 0.6%
Memory size: 7.5 MiB

Length

Max length: 19
Median length: 13
Mean length: 12.51019767
Min length: 6

Characters and Unicode

Total characters: 12296536
Distinct characters: 42
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11
Unique (%): < 0.1%

Sample

1st row: Florideophyceae
2nd row: Magnoliopsida
3rd row: Liliopsida
4th row: Magnoliopsida
5th row: Liliopsida
ValueCountFrequency (%)
magnoliopsida 565617
57.5%
liliopsida 199036
 
20.2%
polypodiopsida 54963
 
5.6%
lecanoromycetes 44421
 
4.5%
bryopsida 29396
 
3.0%
florideophyceae 25770
 
2.6%
cyanobacteriia 11282
 
1.1%
bacillariophyceae 8448
 
0.9%
ulvophyceae 8422
 
0.9%
phaeophyceae 6544
 
0.7%
Other values (58) 29022
 
3.0%

Most occurring characters

ValueCountFrequency (%)
i 1977833
16.1%
o 1746862
14.2%
a 1601551
13.0%
p 985390
8.0%
d 957361
7.8%
s 918228
7.5%
l 873356
7.1%
n 649522
 
5.3%
g 573852
 
4.7%
M 566314
 
4.6%
Other values (32) 1446267
11.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11313615
92.0%
Uppercase Letter 982921
 
8.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1977833
17.5%
o 1746862
15.4%
a 1601551
14.2%
p 985390
8.7%
d 957361
8.5%
s 918228
8.1%
l 873356
7.7%
n 649522
 
5.7%
g 573852
 
5.1%
e 300397
 
2.7%
Other values (13) 729263
 
6.4%
Uppercase Letter
ValueCountFrequency (%)
M 566314
57.6%
L 249439
25.4%
P 67683
 
6.9%
B 38181
 
3.9%
F 25770
 
2.6%
C 13423
 
1.4%
U 8422
 
0.9%
J 5240
 
0.5%
D 2331
 
0.2%
A 2061
 
0.2%
Other values (9) 4057
 
0.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 12296536
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 1977833
16.1%
o 1746862
14.2%
a 1601551
13.0%
p 985390
8.0%
d 957361
7.8%
s 918228
7.5%
l 873356
7.1%
n 649522
 
5.3%
g 573852
 
4.7%
M 566314
 
4.6%
Other values (32) 1446267
11.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12296536
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 1977833
16.1%
o 1746862
14.2%
a 1601551
13.0%
p 985390
8.0%
d 957361
7.8%
s 918228
7.5%
l 873356
7.1%
n 649522
 
5.3%
g 573852
 
4.7%
M 566314
 
4.6%
Other values (32) 1446267
11.8%

order
Text

Missing 

Distinct: 357
Distinct (%): < 0.1%
Missing: 10135
Missing (%): 1.0%
Memory size: 7.5 MiB

Length

Max length: 21
Median length: 18
Mean length: 9.357003763
Min length: 6

Characters and Unicode

Total characters: 9153648
Distinct characters: 49
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 38
Unique (%): < 0.1%

Sample

1st row: Corallinales
2nd row: Fabales
3rd row: Commelinales
4th row: Oxalidales
5th row: Poales
ValueCountFrequency (%)
poales 153750
 
15.7%
asterales 83320
 
8.5%
lamiales 58318
 
6.0%
fabales 55218
 
5.6%
malpighiales 46323
 
4.7%
polypodiales 42295
 
4.3%
gentianales 39541
 
4.0%
myrtales 34933
 
3.6%
caryophyllales 32482
 
3.3%
rosales 28326
 
2.9%
Other values (347) 403761
41.3%

Most occurring characters

ValueCountFrequency (%)
a 1510235
16.5%
l 1284489
14.0%
e 1227651
13.4%
s 1170802
12.8%
i 502041
 
5.5%
o 443593
 
4.8%
r 374076
 
4.1%
n 276338
 
3.0%
t 238442
 
2.6%
P 224898
 
2.5%
Other values (39) 1901083
20.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8175381
89.3%
Uppercase Letter 978267
 
10.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1510235
18.5%
l 1284489
15.7%
e 1227651
15.0%
s 1170802
14.3%
i 502041
 
6.1%
o 443593
 
5.4%
r 374076
 
4.6%
n 276338
 
3.4%
t 238442
 
2.9%
p 219045
 
2.7%
Other values (15) 928669
11.4%
Uppercase Letter
ValueCountFrequency (%)
P 224898
23.0%
A 127170
13.0%
M 105522
10.8%
L 101257
10.4%
C 84594
 
8.6%
F 64567
 
6.6%
S 54625
 
5.6%
G 49863
 
5.1%
R 40816
 
4.2%
E 31238
 
3.2%
Other values (14) 93717
9.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 9153648
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1510235
16.5%
l 1284489
14.0%
e 1227651
13.4%
s 1170802
12.8%
i 502041
 
5.5%
o 443593
 
4.8%
r 374076
 
4.1%
n 276338
 
3.0%
t 238442
 
2.6%
P 224898
 
2.5%
Other values (39) 1901083
20.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9153648
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1510235
16.5%
l 1284489
14.0%
e 1227651
13.4%
s 1170802
12.8%
i 502041
 
5.5%
o 443593
 
4.8%
r 374076
 
4.1%
n 276338
 
3.0%
t 238442
 
2.6%
P 224898
 
2.5%
Other values (39) 1901083
20.8%

superfamily
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 988401
Missing (%): > 99.9%
Memory size: 7.5 MiB

Length

Max length: 36
Median length: 36
Mean length: 36
Min length: 36

Characters and Unicode

Total characters: 36
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 821cc27a-e3bb-4bc5-ac34-89ada245069d
ValueCountFrequency (%)
821cc27a-e3bb-4bc5-ac34-89ada245069d 1
100.0%

Most occurring characters

ValueCountFrequency (%)
c 4
11.1%
a 4
11.1%
- 4
11.1%
2 3
8.3%
b 3
8.3%
4 3
8.3%
8 2
 
5.6%
3 2
 
5.6%
5 2
 
5.6%
9 2
 
5.6%
Other values (6) 7
19.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 18
50.0%
Lowercase Letter 14
38.9%
Dash Punctuation 4
 
11.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 3
16.7%
4 3
16.7%
8 2
11.1%
3 2
11.1%
5 2
11.1%
9 2
11.1%
1 1
 
5.6%
7 1
 
5.6%
0 1
 
5.6%
6 1
 
5.6%
Lowercase Letter
ValueCountFrequency (%)
c 4
28.6%
a 4
28.6%
b 3
21.4%
d 2
14.3%
e 1
 
7.1%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 22
61.1%
Latin 14
38.9%

Most frequent character per script

Common
ValueCountFrequency (%)
- 4
18.2%
2 3
13.6%
4 3
13.6%
8 2
9.1%
3 2
9.1%
5 2
9.1%
9 2
9.1%
1 1
 
4.5%
7 1
 
4.5%
0 1
 
4.5%
Latin
ValueCountFrequency (%)
c 4
28.6%
a 4
28.6%
b 3
21.4%
d 2
14.3%
e 1
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 4
11.1%
a 4
11.1%
- 4
11.1%
2 3
8.3%
b 3
8.3%
4 3
8.3%
8 2
 
5.6%
3 2
 
5.6%
5 2
 
5.6%
9 2
 
5.6%
Other values (6) 7
19.4%

family
Text

Missing 

Distinct: 1293
Distinct (%): 0.1%
Missing: 10432
Missing (%): 1.1%
Memory size: 7.5 MiB

Length

Max length: 23
Median length: 20
Mean length: 10.76219925
Min length: 2

Characters and Unicode

Total characters: 10525108
Distinct characters: 52
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 122
Unique (%): < 0.1%

Sample

1st row: Hapalidiaceae
2nd row: Fabaceae
3rd row: Commelinaceae
4th row: Connaraceae
5th row: Poaceae
ValueCountFrequency (%)
poaceae 110118
 
11.3%
asteraceae 78427
 
8.0%
fabaceae 51638
 
5.3%
cyperaceae 30498
 
3.1%
rubiaceae 26201
 
2.7%
melastomataceae 16271
 
1.7%
malvaceae 14761
 
1.5%
rosaceae 14530
 
1.5%
parmeliaceae 14370
 
1.5%
lamiaceae 13720
 
1.4%
Other values (1283) 607436
62.1%

Most occurring characters

ValueCountFrequency (%)
a 2413737
22.9%
e 2328328
22.1%
c 1165771
11.1%
i 467905
 
4.4%
r 452986
 
4.3%
o 448954
 
4.3%
l 342205
 
3.3%
t 293841
 
2.8%
n 274374
 
2.6%
s 219276
 
2.1%
Other values (42) 2117731
20.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9547099
90.7%
Uppercase Letter 977990
 
9.3%
Connector Punctuation 19
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2413737
25.3%
e 2328328
24.4%
c 1165771
12.2%
i 467905
 
4.9%
r 452986
 
4.7%
o 448954
 
4.7%
l 342205
 
3.6%
t 293841
 
3.1%
n 274374
 
2.9%
s 219276
 
2.3%
Other values (16) 1139722
11.9%
Uppercase Letter
ValueCountFrequency (%)
P 210065
21.5%
A 147452
15.1%
C 108066
11.0%
M 66499
 
6.8%
R 64279
 
6.6%
F 58492
 
6.0%
S 51973
 
5.3%
L 43467
 
4.4%
B 36909
 
3.8%
O 31591
 
3.2%
Other values (15) 159197
16.3%
Connector Punctuation
ValueCountFrequency (%)
_ 19
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10525089
> 99.9%
Common 19
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2413737
22.9%
e 2328328
22.1%
c 1165771
11.1%
i 467905
 
4.4%
r 452986
 
4.3%
o 448954
 
4.3%
l 342205
 
3.3%
t 293841
 
2.8%
n 274374
 
2.6%
s 219276
 
2.1%
Other values (41) 2117712
20.1%
Common
ValueCountFrequency (%)
_ 19
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10525108
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2413737
22.9%
e 2328328
22.1%
c 1165771
11.1%
i 467905
 
4.4%
r 452986
 
4.3%
o 448954
 
4.3%
l 342205
 
3.3%
t 293841
 
2.8%
n 274374
 
2.6%
s 219276
 
2.1%
Other values (42) 2117731
20.1%

subfamily
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 988401
Missing (%): > 99.9%
Memory size: 7.5 MiB

Length

Max length: 24
Median length: 24
Mean length: 24
Min length: 24

Characters and Unicode

Total characters: 24
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 2024-12-02T13:57:09.776Z
ValueCountFrequency (%)
2024-12-02t13:57:09.776z 1
100.0%

Most occurring characters

ValueCountFrequency (%)
2 4
16.7%
0 3
12.5%
7 3
12.5%
- 2
8.3%
1 2
8.3%
: 2
8.3%
4 1
 
4.2%
T 1
 
4.2%
3 1
 
4.2%
5 1
 
4.2%
Other values (4) 4
16.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 17
70.8%
Other Punctuation 3
 
12.5%
Dash Punctuation 2
 
8.3%
Uppercase Letter 2
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 4
23.5%
0 3
17.6%
7 3
17.6%
1 2
11.8%
4 1
 
5.9%
3 1
 
5.9%
5 1
 
5.9%
9 1
 
5.9%
6 1
 
5.9%
Other Punctuation
ValueCountFrequency (%)
: 2
66.7%
. 1
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 1
50.0%
Z 1
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 22
91.7%
Latin 2
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
2 4
18.2%
0 3
13.6%
7 3
13.6%
- 2
9.1%
1 2
9.1%
: 2
9.1%
4 1
 
4.5%
3 1
 
4.5%
5 1
 
4.5%
9 1
 
4.5%
Other values (2) 2
9.1%
Latin
ValueCountFrequency (%)
T 1
50.0%
Z 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 4
16.7%
0 3
12.5%
7 3
12.5%
- 2
8.3%
1 2
8.3%
: 2
8.3%
4 1
 
4.2%
T 1
 
4.2%
3 1
 
4.2%
5 1
 
4.2%
Other values (4) 4
16.7%

tribe
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 988401
Missing (%): > 99.9%
Memory size: 7.5 MiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 5
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 450.0
ValueCountFrequency (%)
450.0 1
100.0%

Most occurring characters

ValueCountFrequency (%)
0 2
40.0%
4 1
20.0%
5 1
20.0%
. 1
20.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4
80.0%
Other Punctuation 1
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2
50.0%
4 1
25.0%
5 1
25.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2
40.0%
4 1
20.0%
5 1
20.0%
. 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2
40.0%
4 1
20.0%
5 1
20.0%
. 1
20.0%

subtribe
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 988401
Missing (%): > 99.9%
Memory size: 7.5 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 4
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 50.0
ValueCountFrequency (%)
50.0 1
100.0%

Most occurring characters

ValueCountFrequency (%)
0 2
50.0%
5 1
25.0%
. 1
25.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3
75.0%
Other Punctuation 1
 
25.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2
66.7%
5 1
33.3%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2
50.0%
5 1
25.0%
. 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2
50.0%
5 1
25.0%
. 1
25.0%

genus
Text

Missing 

Distinct: 14195
Distinct (%): 1.5%
Missing: 15345
Missing (%): 1.6%
Memory size: 7.5 MiB

Length

Max length: 22
Median length: 19
Mean length: 8.8481127
Min length: 2

Characters and Unicode

Total characters: 8609718
Distinct characters: 53
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2072
Unique (%): 0.2%

Sample

1st row: Phymatolithon
2nd row: Amicia
3rd row: Callisia
4th row: Connarus
5th row: Trichoneura
ValueCountFrequency (%)
carex 12742
 
1.3%
miconia 8772
 
0.9%
cladonia 6873
 
0.7%
poa 6684
 
0.7%
cyperus 6044
 
0.6%
paspalum 5820
 
0.6%
solanum 5538
 
0.6%
eragrostis 5205
 
0.5%
dichanthelium 4464
 
0.5%
asplenium 4297
 
0.4%
Other values (14184) 906618
93.2%

Most occurring characters

ValueCountFrequency (%)
a 1065944
 
12.4%
i 802101
 
9.3%
o 610885
 
7.1%
e 599768
 
7.0%
r 564074
 
6.6%
l 476062
 
5.5%
s 450892
 
5.2%
n 446312
 
5.2%
u 428361
 
5.0%
t 360702
 
4.2%
Other values (43) 2804617
32.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7636370
88.7%
Uppercase Letter 973079
 
11.3%
Dash Punctuation 269
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1065944
14.0%
i 802101
10.5%
o 610885
 
8.0%
e 599768
 
7.9%
r 564074
 
7.4%
l 476062
 
6.2%
s 450892
 
5.9%
n 446312
 
5.8%
u 428361
 
5.6%
t 360702
 
4.7%
Other values (16) 1831269
24.0%
Uppercase Letter
ValueCountFrequency (%)
C 136320
14.0%
P 129899
13.3%
S 98619
10.1%
A 86400
 
8.9%
M 64229
 
6.6%
E 54716
 
5.6%
L 48991
 
5.0%
D 46834
 
4.8%
H 40071
 
4.1%
B 40048
 
4.1%
Other values (16) 226952
23.3%
Dash Punctuation
ValueCountFrequency (%)
- 269
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8609449
> 99.9%
Common 269
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1065944
 
12.4%
i 802101
 
9.3%
o 610885
 
7.1%
e 599768
 
7.0%
r 564074
 
6.6%
l 476062
 
5.5%
s 450892
 
5.2%
n 446312
 
5.2%
u 428361
 
5.0%
t 360702
 
4.2%
Other values (42) 2804348
32.6%
Common
ValueCountFrequency (%)
- 269
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8609718
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1065944
 
12.4%
i 802101
 
9.3%
o 610885
 
7.1%
e 599768
 
7.0%
r 564074
 
6.6%
l 476062
 
5.5%
s 450892
 
5.2%
n 446312
 
5.2%
u 428361
 
5.0%
t 360702
 
4.2%
Other values (43) 2804617
32.6%

genericName
Text

Missing 

Distinct: 15150
Distinct (%): 1.6%
Missing: 15400
Missing (%): 1.6%
Memory size: 7.5 MiB

Length

Max length: 20
Median length: 17
Mean length: 8.785601674
Min length: 2

Characters and Unicode

Total characters: 8548408
Distinct characters: 54
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2933
Unique (%): 0.3%

Sample

1st row: Lithothamnion
2nd row: Amicia
3rd row: Tripogandra
4th row: Connarus
5th row: Trichoneura
ValueCountFrequency (%)
carex 12732
 
1.3%
poa 6687
 
0.7%
cyperus 6038
 
0.6%
cladonia 5891
 
0.6%
paspalum 5802
 
0.6%
miconia 5466
 
0.6%
solanum 5416
 
0.6%
eragrostis 5200
 
0.5%
asplenium 4423
 
0.5%
dichanthelium 4230
 
0.4%
Other values (15139) 911117
93.6%

Most occurring characters

ValueCountFrequency (%)
a 1056368
 
12.4%
i 790190
 
9.2%
o 599730
 
7.0%
e 592542
 
6.9%
r 561767
 
6.6%
l 470384
 
5.5%
s 445526
 
5.2%
n 443561
 
5.2%
u 432434
 
5.1%
t 358837
 
4.2%
Other values (44) 2797069
32.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7575383
88.6%
Uppercase Letter 973005
 
11.4%
Dash Punctuation 20
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1056368
13.9%
i 790190
10.4%
o 599730
 
7.9%
e 592542
 
7.8%
r 561767
 
7.4%
l 470384
 
6.2%
s 445526
 
5.9%
n 443561
 
5.9%
u 432434
 
5.7%
t 358837
 
4.7%
Other values (17) 1824044
24.1%
Uppercase Letter
ValueCountFrequency (%)
C 139989
14.4%
P 128503
13.2%
S 96475
9.9%
A 88126
 
9.1%
M 61343
 
6.3%
E 52472
 
5.4%
L 52006
 
5.3%
D 45953
 
4.7%
B 42140
 
4.3%
H 40016
 
4.1%
Other values (16) 225982
23.2%
Dash Punctuation
ValueCountFrequency (%)
- 20
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8548388
> 99.9%
Common 20
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1056368
 
12.4%
i 790190
 
9.2%
o 599730
 
7.0%
e 592542
 
6.9%
r 561767
 
6.6%
l 470384
 
5.5%
s 445526
 
5.2%
n 443561
 
5.2%
u 432434
 
5.1%
t 358837
 
4.2%
Other values (43) 2797049
32.7%
Common
ValueCountFrequency (%)
- 20
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8548392
> 99.9%
None 16
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1056368
 
12.4%
i 790190
 
9.2%
o 599730
 
7.0%
e 592542
 
6.9%
r 561767
 
6.6%
l 470384
 
5.5%
s 445526
 
5.2%
n 443561
 
5.2%
u 432434
 
5.1%
t 358837
 
4.2%
Other values (43) 2797053
32.7%
None
ValueCountFrequency (%)
ë 16
100.0%

infragenericEpithet
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 988401
Missing (%): > 99.9%
Memory size: 7.5 MiB

Length

Max length: 48
Median length: 48
Mean length: 48
Min length: 48

Characters and Unicode

Total characters: 48
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
ValueCountFrequency (%)
occurrence_status_inferred_from_individual_count 1
100.0%

Most occurring characters

ValueCountFrequency (%)
R 5
10.4%
_ 5
10.4%
C 4
8.3%
U 4
8.3%
E 4
8.3%
N 4
8.3%
I 4
8.3%
O 3
 
6.2%
T 3
 
6.2%
D 3
 
6.2%
Other values (6) 9
18.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 43
89.6%
Connector Punctuation 5
 
10.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 5
11.6%
C 4
9.3%
U 4
9.3%
E 4
9.3%
N 4
9.3%
I 4
9.3%
O 3
7.0%
T 3
7.0%
D 3
7.0%
S 2
 
4.7%
Other values (5) 7
16.3%
Connector Punctuation
ValueCountFrequency (%)
_ 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 43
89.6%
Common 5
 
10.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 5
11.6%
C 4
9.3%
U 4
9.3%
E 4
9.3%
N 4
9.3%
I 4
9.3%
O 3
7.0%
T 3
7.0%
D 3
7.0%
S 2
 
4.7%
Other values (5) 7
16.3%
Common
ValueCountFrequency (%)
_ 5
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 48
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 5
10.4%
_ 5
10.4%
C 4
8.3%
U 4
8.3%
E 4
8.3%
N 4
8.3%
I 4
8.3%
O 3
 
6.2%
T 3
 
6.2%
D 3
 
6.2%
Other values (6) 9
18.8%

specificEpithet
Text

Missing 

Distinct: 44923
Distinct (%): 4.9%
Missing: 75483
Missing (%): 7.6%
Memory size: 7.5 MiB

Length

Max length: 25
Median length: 22
Mean length: 9.15062344
Min length: 3

Characters and Unicode

Total characters: 8353778
Distinct characters: 32
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 14847
Unique (%): 1.6%

Sample

1st row: calcareum
2nd row: glandulosa
3rd row: glandulosa
4th row: steyermarkii
5th row: grandiglumis
ValueCountFrequency (%)
canadensis 2613
 
0.3%
guianensis 2604
 
0.3%
americana 2509
 
0.3%
latifolia 2449
 
0.3%
parviflora 2235
 
0.2%
repens 2200
 
0.2%
gracilis 2040
 
0.2%
occidentalis 2004
 
0.2%
indica 1946
 
0.2%
pubescens 1937
 
0.2%
Other values (44913) 890382
97.5%

Most occurring characters

ValueCountFrequency (%)
a 1131933
13.5%
i 964476
11.5%
s 606683
 
7.3%
e 594963
 
7.1%
r 547391
 
6.6%
l 544114
 
6.5%
n 520530
 
6.2%
u 490822
 
5.9%
o 487476
 
5.8%
t 439575
 
5.3%
Other values (22) 2025815
24.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8349817
> 99.9%
Dash Punctuation 3959
 
< 0.1%
Uppercase Letter 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1131933
13.6%
i 964476
11.6%
s 606683
 
7.3%
e 594963
 
7.1%
r 547391
 
6.6%
l 544114
 
6.5%
n 520530
 
6.2%
u 490822
 
5.9%
o 487476
 
5.8%
t 439575
 
5.3%
Other values (19) 2021854
24.2%
Uppercase Letter
ValueCountFrequency (%)
S 1
50.0%
I 1
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 3959
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8349819
> 99.9%
Common 3959
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1131933
13.6%
i 964476
11.6%
s 606683
 
7.3%
e 594963
 
7.1%
r 547391
 
6.6%
l 544114
 
6.5%
n 520530
 
6.2%
u 490822
 
5.9%
o 487476
 
5.8%
t 439575
 
5.3%
Other values (21) 2021856
24.2%
Common
ValueCountFrequency (%)
- 3959
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8353755
> 99.9%
None 23
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1131933
13.5%
i 964476
11.5%
s 606683
 
7.3%
e 594963
 
7.1%
r 547391
 
6.6%
l 544114
 
6.5%
n 520530
 
6.2%
u 490822
 
5.9%
o 487476
 
5.8%
t 439575
 
5.3%
Other values (19) 2025792
24.3%
None
ValueCountFrequency (%)
ï 15
65.2%
ë 6
 
26.1%
ü 2
 
8.7%

infraspecificEpithet
Text

Missing 

Distinct: 6984
Distinct (%): 10.8%
Missing: 923675
Missing (%): 93.5%
Memory size: 7.5 MiB

Length

Max length: 25
Median length: 19
Mean length: 9.201986806
Min length: 4

Characters and Unicode

Total characters: 595617
Distinct characters: 28
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2531
Unique (%): 3.9%

Sample

1st row: oxyphylla
2nd row: subalpinum
3rd row: pubescens
4th row: hirsuta
5th row: crispa
ValueCountFrequency (%)
acuminatum 942
 
1.5%
pubescens 386
 
0.6%
secunda 352
 
0.5%
dichotomum 328
 
0.5%
gracilis 322
 
0.5%
americana 321
 
0.5%
angustifolia 270
 
0.4%
glauca 264
 
0.4%
occidentalis 234
 
0.4%
mexicana 225
 
0.3%
Other values (6974) 61083
94.4%

Most occurring characters

ValueCountFrequency (%)
a 81071
13.6%
i 68606
11.5%
s 43652
 
7.3%
e 41408
 
7.0%
l 40211
 
6.8%
n 37437
 
6.3%
r 36574
 
6.1%
u 35997
 
6.0%
o 33783
 
5.7%
t 30496
 
5.1%
Other values (18) 146382
24.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 595493
> 99.9%
Dash Punctuation 124
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 81071
13.6%
i 68606
11.5%
s 43652
 
7.3%
e 41408
 
7.0%
l 40211
 
6.8%
n 37437
 
6.3%
r 36574
 
6.1%
u 35997
 
6.0%
o 33783
 
5.7%
t 30496
 
5.1%
Other values (17) 146258
24.6%
Dash Punctuation
ValueCountFrequency (%)
- 124
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 595493
> 99.9%
Common 124
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 81071
13.6%
i 68606
11.5%
s 43652
 
7.3%
e 41408
 
7.0%
l 40211
 
6.8%
n 37437
 
6.3%
r 36574
 
6.1%
u 35997
 
6.0%
o 33783
 
5.7%
t 30496
 
5.1%
Other values (17) 146258
24.6%
Common
ValueCountFrequency (%)
- 124
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 595616
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 81071
13.6%
i 68606
11.5%
s 43652
 
7.3%
e 41408
 
7.0%
l 40211
 
6.8%
n 37437
 
6.3%
r 36574
 
6.1%
u 35997
 
6.0%
o 33783
 
5.7%
t 30496
 
5.1%
Other values (17) 146381
24.6%
None
ValueCountFrequency (%)
ë 1
100.0%

cultivarEpithet
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 988401
Missing (%): > 99.9%
Memory size: 7.5 MiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 5
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: false
ValueCountFrequency (%)
false 1
100.0%

Most occurring characters

ValueCountFrequency (%)
f 1
20.0%
a 1
20.0%
l 1
20.0%
s 1
20.0%
e 1
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
f 1
20.0%
a 1
20.0%
l 1
20.0%
s 1
20.0%
e 1
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
f 1
20.0%
a 1
20.0%
l 1
20.0%
s 1
20.0%
e 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
f 1
20.0%
a 1
20.0%
l 1
20.0%
s 1
20.0%
e 1
20.0%
Distinct: 11
Distinct (%): < 0.1%
Missing: 2
Missing (%): < 0.1%
Memory size: 7.5 MiB

Length

Max length: 10
Median length: 7
Mean length: 6.92467928
Min length: 4

Characters and Unicode

Total characters: 6844353
Distinct characters: 27
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: SPECIES
2nd row: SPECIES
3rd row: SPECIES
4th row: SPECIES
5th row: SPECIES
ValueCountFrequency (%)
species 848247
85.8%
genus 60084
 
6.1%
variety 42962
 
4.3%
subspecies 20363
 
2.1%
family 5330
 
0.5%
kingdom 4747
 
0.5%
phylum 4695
 
0.5%
form 1401
 
0.1%
class 501
 
0.1%
order 69
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
E 1840335
26.9%
S 1818669
26.6%
I 921649
13.5%
P 873305
12.8%
C 869111
12.7%
U 85142
 
1.2%
G 64831
 
0.9%
N 64831
 
0.9%
Y 52987
 
0.8%
A 48793
 
0.7%
Other values (17) 204700
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 6844346
> 99.9%
Decimal Number 7
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 1840335
26.9%
S 1818669
26.6%
I 921649
13.5%
P 873305
12.8%
C 869111
12.7%
U 85142
 
1.2%
G 64831
 
0.9%
N 64831
 
0.9%
Y 52987
 
0.8%
A 48793
 
0.7%
Other values (11) 204693
 
3.0%
Decimal Number
ValueCountFrequency (%)
2 2
28.6%
7 1
14.3%
9 1
14.3%
6 1
14.3%
1 1
14.3%
0 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 6844346
> 99.9%
Common 7
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 1840335
26.9%
S 1818669
26.6%
I 921649
13.5%
P 873305
12.8%
C 869111
12.7%
U 85142
 
1.2%
G 64831
 
0.9%
N 64831
 
0.9%
Y 52987
 
0.8%
A 48793
 
0.7%
Other values (11) 204693
 
3.0%
Common
ValueCountFrequency (%)
2 2
28.6%
7 1
14.3%
9 1
14.3%
6 1
14.3%
1 1
14.3%
0 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6844353
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 1840335
26.9%
S 1818669
26.6%
I 921649
13.5%
P 873305
12.8%
C 869111
12.7%
U 85142
 
1.2%
G 64831
 
0.9%
N 64831
 
0.9%
Y 52987
 
0.8%
A 48793
 
0.7%
Other values (17) 204700
 
3.0%

verbatimTaxonRank
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 988401
Missing (%): > 99.9%
Memory size: 7.5 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 7
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 7296210
ValueCountFrequency (%)
7296210 1
100.0%

Most occurring characters

ValueCountFrequency (%)
2 2
28.6%
7 1
14.3%
9 1
14.3%
6 1
14.3%
1 1
14.3%
0 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2
28.6%
7 1
14.3%
9 1
14.3%
6 1
14.3%
1 1
14.3%
0 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Common 7
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2
28.6%
7 1
14.3%
9 1
14.3%
6 1
14.3%
1 1
14.3%
0 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2
28.6%
7 1
14.3%
9 1
14.3%
6 1
14.3%
1 1
14.3%
0 1
14.3%

vernacularName
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 988400
Missing (%): > 99.9%
Memory size: 7.5 MiB

Length

Max length: 8
Median length: 4.5
Mean length: 4.5
Min length: 1

Characters and Unicode

Total characters: 9
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 6
2nd row: HOLOTYPE
ValueCountFrequency (%)
6 1
50.0%
holotype 1
50.0%

Most occurring characters

ValueCountFrequency (%)
O 2
22.2%
6 1
11.1%
H 1
11.1%
L 1
11.1%
T 1
11.1%
Y 1
11.1%
P 1
11.1%
E 1
11.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 8
88.9%
Decimal Number 1
 
11.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O 2
25.0%
H 1
12.5%
L 1
12.5%
T 1
12.5%
Y 1
12.5%
P 1
12.5%
E 1
12.5%
Decimal Number
ValueCountFrequency (%)
6 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8
88.9%
Common 1
 
11.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
O 2
25.0%
H 1
12.5%
L 1
12.5%
T 1
12.5%
Y 1
12.5%
P 1
12.5%
E 1
12.5%
Common
ValueCountFrequency (%)
6 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O 2
22.2%
6 1
11.1%
H 1
11.1%
L 1
11.1%
T 1
11.1%
Y 1
11.1%
P 1
11.1%
E 1
11.1%

nomenclaturalCode
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 988401
Missing (%): > 99.9%
Memory size: 7.5 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 7
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 7707728
ValueCountFrequency (%)
7707728 1
100.0%

Most occurring characters

ValueCountFrequency (%)
7 4
57.1%
0 1
 
14.3%
2 1
 
14.3%
8 1
 
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 4
57.1%
0 1
 
14.3%
2 1
 
14.3%
8 1
 
14.3%

Most occurring scripts

ValueCountFrequency (%)
Common 7
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
7 4
57.1%
0 1
 
14.3%
2 1
 
14.3%
8 1
 
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 4
57.1%
0 1
 
14.3%
2 1
 
14.3%
8 1
 
14.3%
Distinct: 4
Distinct (%): < 0.1%
Missing: 3368
Missing (%): 0.3%
Memory size: 7.5 MiB

Length

Max length: 8
Median length: 8
Mean length: 7.802124597
Min length: 3

Characters and Unicode

Total characters: 7685358
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: SYNONYM
2nd row: ACCEPTED
3rd row: SYNONYM
4th row: ACCEPTED
5th row: ACCEPTED
ValueCountFrequency (%)
accepted 779014
79.1%
synonym 194909
 
19.8%
doubtful 11110
 
1.1%
220 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
E 1558028
20.3%
C 1558028
20.3%
T 790124
10.3%
D 790124
10.3%
A 779014
10.1%
P 779014
10.1%
Y 389818
 
5.1%
N 389818
 
5.1%
O 206019
 
2.7%
S 194909
 
2.5%
Other values (7) 250462
 
3.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 7685355
> 99.9%
Decimal Number 3
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 1558028
20.3%
C 1558028
20.3%
T 790124
10.3%
D 790124
10.3%
A 779014
10.1%
P 779014
10.1%
Y 389818
 
5.1%
N 389818
 
5.1%
O 206019
 
2.7%
S 194909
 
2.5%
Other values (5) 250459
 
3.3%
Decimal Number
ValueCountFrequency (%)
2 2
66.7%
0 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 7685355
> 99.9%
Common 3
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 1558028
20.3%
C 1558028
20.3%
T 790124
10.3%
D 790124
10.3%
A 779014
10.1%
P 779014
10.1%
Y 389818
 
5.1%
N 389818
 
5.1%
O 206019
 
2.7%
S 194909
 
2.5%
Other values (5) 250459
 
3.3%
Common
ValueCountFrequency (%)
2 2
66.7%
0 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7685358
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 1558028
20.3%
C 1558028
20.3%
T 790124
10.3%
D 790124
10.3%
A 779014
10.1%
P 779014
10.1%
Y 389818
 
5.1%
N 389818
 
5.1%
O 206019
 
2.7%
S 194909
 
2.5%
Other values (7) 250462
 
3.3%

nomenclaturalStatus
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 988401
Missing (%): > 99.9%
Memory size: 7.5 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 4
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 1414
ValueCountFrequency (%)
1414 1
100.0%

Most occurring characters

ValueCountFrequency (%)
1 2
50.0%
4 2
50.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2
50.0%
4 2
50.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2
50.0%
4 2
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2
50.0%
4 2
50.0%

taxonRemarks
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 988401
Missing (%): > 99.9%
Memory size: 7.5 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 4
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 6631
ValueCountFrequency (%)
6631 1
100.0%

Most occurring characters

ValueCountFrequency (%)
6 2
50.0%
3 1
25.0%
1 1
25.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 2
50.0%
3 1
25.0%
1 1
25.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
6 2
50.0%
3 1
25.0%
1 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 2
50.0%
3 1
25.0%
1 1
25.0%
Distinct: 2
Distinct (%): < 0.1%
Missing: 2
Missing (%): < 0.1%
Memory size: 7.5 MiB

Length

Max length: 36
Median length: 36
Mean length: 35.99997066
Min length: 7

Characters and Unicode

Total characters: 35582371
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 821cc27a-e3bb-4bc5-ac34-89ada245069d
2nd row: 821cc27a-e3bb-4bc5-ac34-89ada245069d
3rd row: 821cc27a-e3bb-4bc5-ac34-89ada245069d
4th row: 821cc27a-e3bb-4bc5-ac34-89ada245069d
5th row: 821cc27a-e3bb-4bc5-ac34-89ada245069d
ValueCountFrequency (%)
821cc27a-e3bb-4bc5-ac34-89ada245069d 988399
> 99.9%
7296208 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
c 3953596
11.1%
a 3953596
11.1%
- 3953596
11.1%
2 2965199
8.3%
b 2965197
8.3%
4 2965197
8.3%
8 1976799
 
5.6%
9 1976799
 
5.6%
3 1976798
 
5.6%
5 1976798
 
5.6%
Other values (6) 6918796
19.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 17791189
50.0%
Lowercase Letter 13837586
38.9%
Dash Punctuation 3953596
 
11.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2965199
16.7%
4 2965197
16.7%
8 1976799
11.1%
9 1976799
11.1%
3 1976798
11.1%
5 1976798
11.1%
7 988400
 
5.6%
0 988400
 
5.6%
6 988400
 
5.6%
1 988399
 
5.6%
Lowercase Letter
ValueCountFrequency (%)
c 3953596
28.6%
a 3953596
28.6%
b 2965197
21.4%
d 1976798
14.3%
e 988399
 
7.1%
Dash Punctuation
ValueCountFrequency (%)
- 3953596
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 21744785
61.1%
Latin 13837586
38.9%

Most frequent character per script

Common
ValueCountFrequency (%)
- 3953596
18.2%
2 2965199
13.6%
4 2965197
13.6%
8 1976799
9.1%
9 1976799
9.1%
3 1976798
9.1%
5 1976798
9.1%
7 988400
 
4.5%
0 988400
 
4.5%
6 988400
 
4.5%
Latin
ValueCountFrequency (%)
c 3953596
28.6%
a 3953596
28.6%
b 2965197
21.4%
d 1976798
14.3%
e 988399
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 35582371
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 3953596
11.1%
a 3953596
11.1%
- 3953596
11.1%
2 2965199
8.3%
b 2965197
8.3%
4 2965197
8.3%
8 1976799
 
5.6%
9 1976799
 
5.6%
3 1976798
 
5.6%
5 1976798
 
5.6%
Other values (6) 6918796
19.4%

publishingCountry
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 3
Missing (%): < 0.1%
Memory size: 7.5 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1976798
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: US
2nd row: US
3rd row: US
4th row: US
5th row: US
ValueCountFrequency (%)
us 988399
100.0%

Most occurring characters

ValueCountFrequency (%)
U 988399
50.0%
S 988399
50.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1976798
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 988399
50.0%
S 988399
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1976798
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 988399
50.0%
S 988399
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1976798
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 988399
50.0%
S 988399
50.0%
Distinct: 200353
Distinct (%): 20.3%
Missing: 2
Missing (%): < 0.1%
Memory size: 7.5 MiB

Length

Max length: 24
Median length: 24
Mean length: 23.99574565
Min length: 7

Characters and Unicode

Total characters: 23717395
Distinct characters: 15
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 20728
Unique (%): 2.1%

Sample

1st row: 2024-12-02T13:59:14.452Z
2nd row: 2024-12-02T13:57:49.629Z
3rd row: 2024-12-02T13:57:49.533Z
4th row: 2024-12-02T13:59:17.370Z
5th row: 2024-12-02T13:59:30.710Z
ValueCountFrequency (%)
2024-12-02t13:56:52.667z 24
 
< 0.1%
2024-12-02t13:57:28.323z 24
 
< 0.1%
2024-12-02t13:57:53.831z 24
 
< 0.1%
2024-12-02t13:57:53.200z 23
 
< 0.1%
2024-12-02t13:57:24.579z 23
 
< 0.1%
2024-12-02t13:57:45.844z 23
 
< 0.1%
2024-12-02t13:57:43.276z 23
 
< 0.1%
2024-12-02t13:57:45.207z 23
 
< 0.1%
2024-12-02t13:57:50.630z 22
 
< 0.1%
2024-12-02t13:57:52.903z 22
 
< 0.1%
Other values (200343) 988169
> 99.9%

Most occurring characters

ValueCountFrequency (%)
2 4511298
19.0%
0 2508199
10.6%
1 2493151
10.5%
- 1976798
8.3%
: 1976798
8.3%
4 1590376
 
6.7%
5 1570391
 
6.6%
3 1563965
 
6.6%
T 988399
 
4.2%
Z 988399
 
4.2%
Other values (5) 3549621
15.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16799649
70.8%
Other Punctuation 2964150
 
12.5%
Dash Punctuation 1976798
 
8.3%
Uppercase Letter 1976798
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 4511298
26.9%
0 2508199
14.9%
1 2493151
14.8%
4 1590376
 
9.5%
5 1570391
 
9.3%
3 1563965
 
9.3%
7 759695
 
4.5%
9 633974
 
3.8%
6 594334
 
3.5%
8 574266
 
3.4%
Other Punctuation
ValueCountFrequency (%)
: 1976798
66.7%
. 987352
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 988399
50.0%
Z 988399
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 1976798
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 21740597
91.7%
Latin 1976798
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
2 4511298
20.8%
0 2508199
11.5%
1 2493151
11.5%
- 1976798
9.1%
: 1976798
9.1%
4 1590376
 
7.3%
5 1570391
 
7.2%
3 1563965
 
7.2%
. 987352
 
4.5%
7 759695
 
3.5%
Other values (3) 1802574
 
8.3%
Latin
ValueCountFrequency (%)
T 988399
50.0%
Z 988399
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23717395
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 4511298
19.0%
0 2508199
10.6%
1 2493151
10.5%
- 1976798
8.3%
: 1976798
8.3%
4 1590376
 
6.7%
5 1570391
 
6.6%
3 1563965
 
6.6%
T 988399
 
4.2%
Z 988399
 
4.2%
Other values (5) 3549621
15.0%

elevation
Text

Missing 

Distinct: 4953
Distinct (%): 1.4%
Missing: 625728
Missing (%): 63.3%
Memory size: 7.5 MiB

Length

Max length: 18
Median length: 6
Mean length: 5.363784004
Min length: 3

Characters and Unicode

Total characters: 1945305
Distinct characters: 25
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1031
Unique (%): 0.3%

Sample

1st row: 2742.0
2nd row: 750.0
3rd row: 50.0
4th row: 225.0
5th row: 17.0
ValueCountFrequency (%)
1000.0 6075
 
1.7%
100.0 5877
 
1.6%
500.0 4957
 
1.4%
200.0 4795
 
1.3%
300.0 4744
 
1.3%
800.0 4519
 
1.2%
400.0 4320
 
1.2%
1500.0 4189
 
1.2%
1200.0 4187
 
1.2%
900.0 4104
 
1.1%
Other values (4927) 314908
86.8%

Most occurring characters

ValueCountFrequency (%)
0 709378
36.5%
. 362673
18.6%
1 180866
 
9.3%
5 150898
 
7.8%
2 145612
 
7.5%
3 90804
 
4.7%
4 69684
 
3.6%
7 63645
 
3.3%
6 61615
 
3.2%
8 58985
 
3.0%
Other values (15) 51145
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1582558
81.4%
Other Punctuation 362673
 
18.6%
Dash Punctuation 57
 
< 0.1%
Lowercase Letter 15
 
< 0.1%
Uppercase Letter 1
 
< 0.1%
Space Separator 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 2
13.3%
o 2
13.3%
r 2
13.3%
a 2
13.3%
p 1
6.7%
f 1
6.7%
b 1
6.7%
u 1
6.7%
e 1
6.7%
n 1
6.7%
Decimal Number
ValueCountFrequency (%)
0 709378
44.8%
1 180866
 
11.4%
5 150898
 
9.5%
2 145612
 
9.2%
3 90804
 
5.7%
4 69684
 
4.4%
7 63645
 
4.0%
6 61615
 
3.9%
8 58985
 
3.7%
9 51071
 
3.2%
Other Punctuation
ValueCountFrequency (%)
. 362673
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 57
100.0%
Uppercase Letter
ValueCountFrequency (%)
R 1
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1945289
> 99.9%
Latin 16
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 709378
36.5%
. 362673
18.6%
1 180866
 
9.3%
5 150898
 
7.8%
2 145612
 
7.5%
3 90804
 
4.7%
4 69684
 
3.6%
7 63645
 
3.3%
6 61615
 
3.2%
8 58985
 
3.0%
Other values (3) 51129
 
2.6%
Latin
ValueCountFrequency (%)
i 2
12.5%
o 2
12.5%
r 2
12.5%
a 2
12.5%
p 1
6.2%
f 1
6.2%
b 1
6.2%
u 1
6.2%
R 1
6.2%
e 1
6.2%
Other values (2) 2
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1945305
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 709378
36.5%
. 362673
18.6%
1 180866
 
9.3%
5 150898
 
7.8%
2 145612
 
7.5%
3 90804
 
4.7%
4 69684
 
3.6%
7 63645
 
3.3%
6 61615
 
3.2%
8 58985
 
3.0%
Other values (15) 51145
 
2.6%

elevationAccuracy
Text

Missing 

Distinct: 858
Distinct (%): 0.8%
Missing: 880635
Missing (%): 89.1%
Memory size: 7.5 MiB

Length

Max length: 32
Median length: 19
Mean length: 4.058227472
Min length: 3

Characters and Unicode

Total characters: 437343
Distinct characters: 25
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 278
Unique (%): 0.3%

Sample

1st row: 225.0
2nd row: 100.0
3rd row: 0.0
4th row: 0.0
5th row: 259.0
ValueCountFrequency (%)
0.0 25899
24.0%
50.0 12761
 
11.8%
100.0 8238
 
7.6%
150.0 5589
 
5.2%
25.0 5263
 
4.9%
75.0 3266
 
3.0%
200.0 3102
 
2.9%
152.5 2249
 
2.1%
10.0 1930
 
1.8%
250.0 1871
 
1.7%
Other values (850) 37602
34.9%

Most occurring characters

ValueCountFrequency (%)
0 178346
40.8%
. 107766
24.6%
5 60739
 
13.9%
1 31294
 
7.2%
2 23940
 
5.5%
7 10221
 
2.3%
3 9561
 
2.2%
4 5552
 
1.3%
6 4732
 
1.1%
8 2655
 
0.6%
Other values (15) 2537
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 329546
75.4%
Other Punctuation 107766
 
24.6%
Lowercase Letter 27
 
< 0.1%
Space Separator 3
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 4
14.8%
a 4
14.8%
i 3
11.1%
o 3
11.1%
p 2
7.4%
u 2
7.4%
b 2
7.4%
f 2
7.4%
l 2
7.4%
e 1
 
3.7%
Other values (2) 2
7.4%
Decimal Number
ValueCountFrequency (%)
0 178346
54.1%
5 60739
 
18.4%
1 31294
 
9.5%
2 23940
 
7.3%
7 10221
 
3.1%
3 9561
 
2.9%
4 5552
 
1.7%
6 4732
 
1.4%
8 2655
 
0.8%
9 2506
 
0.8%
Other Punctuation
ValueCountFrequency (%)
. 107766
100.0%
Space Separator
ValueCountFrequency (%)
3
100.0%
Uppercase Letter
ValueCountFrequency (%)
R 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 437315
> 99.9%
Latin 28
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 4
14.3%
a 4
14.3%
i 3
10.7%
o 3
10.7%
p 2
7.1%
u 2
7.1%
b 2
7.1%
f 2
7.1%
l 2
7.1%
e 1
 
3.6%
Other values (3) 3
10.7%
Common
ValueCountFrequency (%)
0 178346
40.8%
. 107766
24.6%
5 60739
 
13.9%
1 31294
 
7.2%
2 23940
 
5.5%
7 10221
 
2.3%
3 9561
 
2.2%
4 5552
 
1.3%
6 4732
 
1.1%
8 2655
 
0.6%
Other values (2) 2509
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 437343
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 178346
40.8%
. 107766
24.6%
5 60739
 
13.9%
1 31294
 
7.2%
2 23940
 
5.5%
7 10221
 
2.3%
3 9561
 
2.2%
4 5552
 
1.3%
6 4732
 
1.1%
8 2655
 
0.6%
Other values (15) 2537
 
0.6%

depth
Text

Missing 

Distinct: 138
Distinct (%): 1.6%
Missing: 979722
Missing (%): 99.1%
Memory size: 7.5 MiB

Length

Max length: 32
Median length: 4
Mean length: 3.671198157
Min length: 3

Characters and Unicode

Total characters: 31866
Distinct characters: 25
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 44
Unique (%): 0.5%

Sample

1st row: 3.0
2nd row: 24.0
3rd row: 3.0
4th row: 12.0
5th row: 6.0
ValueCountFrequency (%)
12.0 1243
14.3%
18.0 1164
13.4%
6.0 1076
12.4%
24.0 937
10.8%
3.0 492
 
5.7%
43.0 404
 
4.7%
32.0 402
 
4.6%
1.5 287
 
3.3%
10.0 209
 
2.4%
13.0 171
 
2.0%
Other values (130) 2298
26.5%

Most occurring characters

ValueCountFrequency (%)
. 8680
27.2%
0 7901
24.8%
1 3977
12.5%
2 2988
 
9.4%
4 1703
 
5.3%
3 1695
 
5.3%
5 1653
 
5.2%
6 1421
 
4.5%
8 1313
 
4.1%
7 341
 
1.1%
Other values (15) 194
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 23155
72.7%
Other Punctuation 8680
 
27.2%
Lowercase Letter 27
 
0.1%
Space Separator 3
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 4
14.8%
a 4
14.8%
i 3
11.1%
o 3
11.1%
p 2
7.4%
u 2
7.4%
b 2
7.4%
f 2
7.4%
l 2
7.4%
e 1
 
3.7%
Other values (2) 2
7.4%
Decimal Number
ValueCountFrequency (%)
0 7901
34.1%
1 3977
17.2%
2 2988
 
12.9%
4 1703
 
7.4%
3 1695
 
7.3%
5 1653
 
7.1%
6 1421
 
6.1%
8 1313
 
5.7%
7 341
 
1.5%
9 163
 
0.7%
Other Punctuation
ValueCountFrequency (%)
. 8680
100.0%
Space Separator
ValueCountFrequency (%)
3
100.0%
Uppercase Letter
ValueCountFrequency (%)
R 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 31838
99.9%
Latin 28
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 4
14.3%
a 4
14.3%
i 3
10.7%
o 3
10.7%
p 2
7.1%
u 2
7.1%
b 2
7.1%
f 2
7.1%
l 2
7.1%
e 1
 
3.6%
Other values (3) 3
10.7%
Common
ValueCountFrequency (%)
. 8680
27.3%
0 7901
24.8%
1 3977
12.5%
2 2988
 
9.4%
4 1703
 
5.3%
3 1695
 
5.3%
5 1653
 
5.2%
6 1421
 
4.5%
8 1313
 
4.1%
7 341
 
1.1%
Other values (2) 166
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 31866
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 8680
27.2%
0 7901
24.8%
1 3977
12.5%
2 2988
 
9.4%
4 1703
 
5.3%
3 1695
 
5.3%
5 1653
 
5.2%
6 1421
 
4.5%
8 1313
 
4.1%
7 341
 
1.1%
Other values (15) 194
 
0.6%

depthAccuracy
Text

Missing 

Distinct: 38
Distinct (%): 0.5%
Missing: 980482
Missing (%): 99.2%
Memory size: 7.5 MiB

Length

Max length: 18
Median length: 3
Mean length: 3.033964646
Min length: 3

Characters and Unicode

Total characters: 24029
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 13
Unique (%): 0.2%

Sample

1st row: 0.0
2nd row: 3.0
3rd row: 0.0
4th row: 3.0
5th row: 3.0
ValueCountFrequency (%)
3.0 4303
54.3%
1.0 646
 
8.2%
1.5 562
 
7.1%
6.0 519
 
6.6%
0.0 430
 
5.4%
5.0 409
 
5.2%
2.5 208
 
2.6%
2.0 165
 
2.1%
4.5 141
 
1.8%
0.5 119
 
1.5%
Other values (28) 418
 
5.3%

Most occurring characters

ValueCountFrequency (%)
. 7920
33.0%
0 7391
30.8%
3 4349
18.1%
5 1773
 
7.4%
1 1300
 
5.4%
6 546
 
2.3%
2 399
 
1.7%
4 205
 
0.9%
7 130
 
0.5%
8 9
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16109
67.0%
Other Punctuation 7920
33.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 7391
45.9%
3 4349
27.0%
5 1773
 
11.0%
1 1300
 
8.1%
6 546
 
3.4%
2 399
 
2.5%
4 205
 
1.3%
7 130
 
0.8%
8 9
 
0.1%
9 7
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
. 7920
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 24029
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 7920
33.0%
0 7391
30.8%
3 4349
18.1%
5 1773
 
7.4%
1 1300
 
5.4%
6 546
 
2.3%
2 399
 
1.7%
4 205
 
0.9%
7 130
 
0.5%
8 9
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24029
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 7920
33.0%
0 7391
30.8%
3 4349
18.1%
5 1773
 
7.4%
1 1300
 
5.4%
6 546
 
2.3%
2 399
 
1.7%
4 205
 
0.9%
7 130
 
0.5%
8 9
 
< 0.1%
Distinct: 268
Distinct (%): 45.0%
Missing: 987807
Missing (%): 99.9%
Memory size: 7.5 MiB

Length

Max length: 18
Median length: 17
Mean length: 16.98151261
Min length: 3

Characters and Unicode

Total characters: 10104
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 175
Unique (%): 29.4%

Sample

1st row: 4008.084105458271
2nd row: 3618.177880660989
3rd row: 3836.5095124475733
4th row: 4578.201466648226
5th row: 4726.696371513394
ValueCountFrequency (%)
2015.7207067821585 45
 
7.6%
3318.235939960053 28
 
4.7%
3731.647014894624 17
 
2.9%
0.0 16
 
2.7%
2241.7609420453923 15
 
2.5%
365.55388600261153 11
 
1.8%
4225.163801327021 11
 
1.8%
4008.084105458271 10
 
1.7%
4819.432257301775 10
 
1.7%
4954.407240854524 8
 
1.3%
Other values (258) 424
71.3%

Most occurring characters

ValueCountFrequency (%)
2 1043
10.3%
4 1043
10.3%
5 1013
10.0%
3 1003
9.9%
0 1000
9.9%
1 943
9.3%
8 927
9.2%
7 918
9.1%
6 832
8.2%
9 785
7.8%
Other values (4) 597
5.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 9507
94.1%
Other Punctuation 594
 
5.9%
Uppercase Letter 3
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 1043
11.0%
4 1043
11.0%
5 1013
10.7%
3 1003
10.6%
0 1000
10.5%
1 943
9.9%
8 927
9.8%
7 918
9.7%
6 832
8.8%
9 785
8.3%
Uppercase Letter
ValueCountFrequency (%)
E 1
33.3%
M 1
33.3%
L 1
33.3%
Other Punctuation
ValueCountFrequency (%)
. 594
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 10101
> 99.9%
Latin 3
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
2 1043
10.3%
4 1043
10.3%
5 1013
10.0%
3 1003
9.9%
0 1000
9.9%
1 943
9.3%
8 927
9.2%
7 918
9.1%
6 832
8.2%
9 785
7.8%
Latin
ValueCountFrequency (%)
E 1
33.3%
M 1
33.3%
L 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10104
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 1043
10.3%
4 1043
10.3%
5 1013
10.0%
3 1003
9.9%
0 1000
9.9%
1 943
9.3%
8 927
9.2%
7 918
9.1%
6 832
8.2%
9 785
7.8%
Other values (4) 597
5.9%

issue
Text

Distinct228
Distinct (%)< 0.1%
Missing101
Missing (%)< 0.1%
Memory size7.5 MiB

Length

Max length207
Median length48
Mean length55.97726907
Min length17

Characters and Unicode

Total characters55322391
Distinct characters39
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique42 ?
Unique (%)< 0.1%

Sample

1st rowOCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;CONTINENT_COORDINATE_MISMATCH
2nd rowOCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
3rd rowOCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
4th rowOCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
5th rowOCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
ValueCountFrequency (%)
occurrence_status_inferred_from_individual_count 746709
75.6%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84 122708
 
12.4%
occurrence_status_inferred_from_individual_count;taxon_match_higherrank 20378
 
2.1%
occurrence_status_inferred_from_individual_count;recorded_date_mismatch 20160
 
2.0%
occurrence_status_inferred_from_individual_count;continent_derived_from_country;continent_invalid 18108
 
1.8%
occurrence_status_inferred_from_individual_count;continent_country_mismatch 10576
 
1.1%
occurrence_status_inferred_from_individual_count;taxon_match_fuzzy 10400
 
1.1%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;geodetic_datum_invalid 4482
 
0.5%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;taxon_match_higherrank 4056
 
0.4%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;continent_derived_from_coordinates;continent_invalid 3294
 
0.3%
Other values (218) 27430
 
2.8%
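The value table above counts whole `issue` strings, so a flag such as GEODETIC_DATUM_ASSUMED_WGS84 is spread across many rows of the table. As a minimal sketch, assuming the column is available as a list of strings with `None` for missing cells, per-flag totals could be tallied as below. The function name `count_issue_flags` and the `sample` list are illustrative and not part of the report.

```python
from collections import Counter


def count_issue_flags(values):
    """Count each individual flag across semicolon-delimited issue strings."""
    counts = Counter()
    for value in values:
        if not value:  # skip missing cells (None or empty string)
            continue
        for flag in value.split(";"):
            flag = flag.strip()
            if flag:
                counts[flag] += 1
    return counts


# Illustrative input modelled on the sample rows shown in the report.
sample = [
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;"
    "GEODETIC_DATUM_ASSUMED_WGS84;CONTINENT_COORDINATE_MISMATCH",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT",
    None,
]
flag_counts = count_issue_flags(sample)
for flag, n in flag_counts.most_common():
    print(flag, n)
```

On the full column, the same tally would give one count per distinct flag rather than one per distinct combination of flags.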

Most occurring characters

ValueCountFrequency (%)
_ 5687389
10.3%
R 5152841
9.3%
E 4657864
 
8.4%
I 4362875
 
7.9%
C 4341521
 
7.8%
N 4323540
 
7.8%
U 4305475
 
7.8%
T 3629406
 
6.6%
D 3603366
 
6.5%
O 3347835
 
6.1%
Other values (29) 11910279
21.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 49042028
88.6%
Connector Punctuation 5687389
 
10.3%
Other Punctuation 301057
 
0.5%
Decimal Number 291915
 
0.5%
Dash Punctuation 2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 5152841
10.5%
E 4657864
9.5%
I 4362875
8.9%
C 4341521
8.9%
N 4323540
8.8%
U 4305475
8.8%
T 3629406
7.4%
D 3603366
7.3%
O 3347835
 
6.8%
A 2523853
 
5.1%
Other values (14) 8793452
17.9%
Decimal Number
ValueCountFrequency (%)
4 145950
50.0%
8 145949
50.0%
2 4
 
< 0.1%
0 3
 
< 0.1%
7 3
 
< 0.1%
1 2
 
< 0.1%
3 1
 
< 0.1%
5 1
 
< 0.1%
9 1
 
< 0.1%
6 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
; 301054
> 99.9%
: 2
 
< 0.1%
. 1
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 5687389
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 49042028
88.6%
Common 6280363
 
11.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 5152841
10.5%
E 4657864
9.5%
I 4362875
8.9%
C 4341521
8.9%
N 4323540
8.8%
U 4305475
8.8%
T 3629406
7.4%
D 3603366
7.3%
O 3347835
 
6.8%
A 2523853
 
5.1%
Other values (14) 8793452
17.9%
Common
ValueCountFrequency (%)
_ 5687389
90.6%
; 301054
 
4.8%
4 145950
 
2.3%
8 145949
 
2.3%
2 4
 
< 0.1%
0 3
 
< 0.1%
7 3
 
< 0.1%
: 2
 
< 0.1%
- 2
 
< 0.1%
1 2
 
< 0.1%
Other values (5) 5
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 55322391
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
_ 5687389
10.3%
R 5152841
9.3%
E 4657864
 
8.4%
I 4362875
 
7.9%
C 4341521
 
7.8%
N 4323540
 
7.8%
U 4305475
 
7.8%
T 3629406
 
6.6%
D 3603366
 
6.5%
O 3347835
 
6.1%
Other values (29) 11910279
21.5%

mediaType
Text

Missing 

Distinct46
Distinct (%)< 0.1%
Missing69371
Missing (%)7.0%
Memory size7.5 MiB

Length

Max length571
Median length10
Mean length10.82667505
Min length10

Characters and Unicode

Total characters9950050
Distinct characters22
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique12 ?
Unique (%)< 0.1%

Sample

1st rowStillImage
2nd rowStillImage
3rd rowStillImage
4th rowStillImage
5th rowStillImage
ValueCountFrequency (%)
stillimage 864575
94.1%
stillimage;stillimage 50344
 
5.5%
stillimage;stillimage;stillimage 1419
 
0.2%
stillimage;stillimage;stillimage;stillimage 997
 
0.1%
stillimage;stillimage;stillimage;stillimage;stillimage 485
 
0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage 345
 
< 0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage 233
 
< 0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage 154
 
< 0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage 88
 
< 0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage 61
 
< 0.1%
Other values (36) 330
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
l 1976192
19.9%
S 988096
9.9%
i 988096
9.9%
I 988096
9.9%
m 988096
9.9%
a 988096
9.9%
g 988096
9.9%
e 988096
9.9%
t 988096
9.9%
; 69066
 
0.7%
Other values (12) 24
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7904768
79.4%
Uppercase Letter 1976194
 
19.9%
Other Punctuation 69069
 
0.7%
Decimal Number 17
 
< 0.1%
Dash Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 1976192
25.0%
i 988096
12.5%
m 988096
12.5%
a 988096
12.5%
g 988096
12.5%
e 988096
12.5%
t 988096
12.5%
Decimal Number
ValueCountFrequency (%)
2 5
29.4%
1 4
23.5%
4 3
17.6%
0 2
 
11.8%
8 1
 
5.9%
3 1
 
5.9%
6 1
 
5.9%
Uppercase Letter
ValueCountFrequency (%)
S 988096
50.0%
I 988096
50.0%
T 1
 
< 0.1%
Z 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
; 69066
> 99.9%
: 2
 
< 0.1%
. 1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9880962
99.3%
Common 69088
 
0.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 1976192
20.0%
S 988096
10.0%
i 988096
10.0%
I 988096
10.0%
m 988096
10.0%
a 988096
10.0%
g 988096
10.0%
e 988096
10.0%
t 988096
10.0%
T 1
 
< 0.1%
Common
ValueCountFrequency (%)
; 69066
> 99.9%
2 5
 
< 0.1%
1 4
 
< 0.1%
4 3
 
< 0.1%
: 2
 
< 0.1%
0 2
 
< 0.1%
- 2
 
< 0.1%
8 1
 
< 0.1%
3 1
 
< 0.1%
. 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9950050
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 1976192
19.9%
S 988096
9.9%
i 988096
9.9%
I 988096
9.9%
m 988096
9.9%
a 988096
9.9%
g 988096
9.9%
e 988096
9.9%
t 988096
9.9%
; 69066
 
0.7%
Other values (12) 24
 
< 0.1%
Distinct3
Distinct (%)< 0.1%
Missing1
Missing (%)< 0.1%
Memory size7.5 MiB

Length

Max length34
Median length5
Mean length4.850901608
Min length4

Characters and Unicode

Total characters4794636
Distinct characters19
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowtrue
2nd rowfalse
3rd rowfalse
4th rowfalse
5th rowfalse
ValueCountFrequency (%)
false 841002
85.1%
true 147398
 
14.9%
rollinia 1
 
< 0.1%
edulis 1
 
< 0.1%
var 1
 
< 0.1%
acuta 1
 
< 0.1%
r.e.fr 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 988401
20.6%
a 841006
17.5%
l 841005
17.5%
s 841003
17.5%
f 841002
17.5%
r 147400
 
3.1%
u 147400
 
3.1%
t 147399
 
3.1%
. 4
 
< 0.1%
4
 
< 0.1%
Other values (9) 12
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4794624
> 99.9%
Other Punctuation 4
 
< 0.1%
Space Separator 4
 
< 0.1%
Uppercase Letter 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 988401
20.6%
a 841006
17.5%
l 841005
17.5%
s 841003
17.5%
f 841002
17.5%
r 147400
 
3.1%
u 147400
 
3.1%
t 147399
 
3.1%
i 3
 
< 0.1%
c 1
 
< 0.1%
Other values (4) 4
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
R 2
50.0%
E 1
25.0%
F 1
25.0%
Other Punctuation
ValueCountFrequency (%)
. 4
100.0%
Space Separator
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4794628
> 99.9%
Common 8
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 988401
20.6%
a 841006
17.5%
l 841005
17.5%
s 841003
17.5%
f 841002
17.5%
r 147400
 
3.1%
u 147400
 
3.1%
t 147399
 
3.1%
i 3
 
< 0.1%
R 2
 
< 0.1%
Other values (7) 7
 
< 0.1%
Common
ValueCountFrequency (%)
. 4
50.0%
4
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4794636
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 988401
20.6%
a 841006
17.5%
l 841005
17.5%
s 841003
17.5%
f 841002
17.5%
r 147400
 
3.1%
u 147400
 
3.1%
t 147399
 
3.1%
. 4
 
< 0.1%
4
 
< 0.1%
Other values (9) 12
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing3
Missing (%)< 0.1%
Memory size7.5 MiB

Length

Max length5
Median length5
Mean length4.99511331
Min length4

Characters and Unicode

Total characters4937165
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowfalse
2nd rowfalse
3rd rowfalse
4th rowfalse
5th rowfalse
ValueCountFrequency (%)
false 983569
99.5%
true 4830
 
0.5%

Most occurring characters

ValueCountFrequency (%)
e 988399
20.0%
f 983569
19.9%
a 983569
19.9%
l 983569
19.9%
s 983569
19.9%
t 4830
 
0.1%
r 4830
 
0.1%
u 4830
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4937165
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 988399
20.0%
f 983569
19.9%
a 983569
19.9%
l 983569
19.9%
s 983569
19.9%
t 4830
 
0.1%
r 4830
 
0.1%
u 4830
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 4937165
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 988399
20.0%
f 983569
19.9%
a 983569
19.9%
l 983569
19.9%
s 983569
19.9%
t 4830
 
0.1%
r 4830
 
0.1%
u 4830
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4937165
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 988399
20.0%
f 983569
19.9%
a 983569
19.9%
l 983569
19.9%
s 983569
19.9%
t 4830
 
0.1%
r 4830
 
0.1%
u 4830
 
0.1%
Distinct171484
Distinct (%)17.3%
Missing3
Missing (%)< 0.1%
Memory size7.5 MiB

Length

Max length8
Median length7
Mean length6.964352453
Min length1

Characters and Unicode

Total characters6883559
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique76155 ?
Unique (%)7.7%

Sample

1st row2654951
2nd row2947270
3rd row2765389
4th row3687053
5th row7355530
ValueCountFrequency (%)
8176985 3995
 
0.4%
0 3366
 
0.3%
2655370 1333
 
0.1%
6 1163
 
0.1%
3219107 1082
 
0.1%
5426909 1064
 
0.1%
5426949 994
 
0.1%
4270616 933
 
0.1%
2655497 809
 
0.1%
2654437 772
 
0.1%
Other values (171474) 972888
98.4%

Most occurring characters

ValueCountFrequency (%)
2 890366
12.9%
3 810909
11.8%
5 728630
10.6%
7 726369
10.6%
0 644591
9.4%
6 635380
9.2%
8 634801
9.2%
1 619707
9.0%
9 605275
8.8%
4 587531
8.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6883559
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 890366
12.9%
3 810909
11.8%
5 728630
10.6%
7 726369
10.6%
0 644591
9.4%
6 635380
9.2%
8 634801
9.2%
1 619707
9.0%
9 605275
8.8%
4 587531
8.5%

Most occurring scripts

ValueCountFrequency (%)
Common 6883559
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 890366
12.9%
3 810909
11.8%
5 728630
10.6%
7 726369
10.6%
0 644591
9.4%
6 635380
9.2%
8 634801
9.2%
1 619707
9.0%
9 605275
8.8%
4 587531
8.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6883559
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 890366
12.9%
3 810909
11.8%
5 728630
10.6%
7 726369
10.6%
0 644591
9.4%
6 635380
9.2%
8 634801
9.2%
1 619707
9.0%
9 605275
8.8%
4 587531
8.5%
Distinct141149
Distinct (%)14.3%
Missing3368
Missing (%)0.3%
Memory size7.5 MiB

Length

Max length8
Median length7
Mean length7.000687286
Min length1

Characters and Unicode

Total characters6895915
Distinct characters15
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique52485 ?
Unique (%)5.3%

Sample

1st row2654944
2nd row2947270
3rd row10416230
4th row3687053
5th row7355530
ValueCountFrequency (%)
7947184 4001
 
0.4%
2655370 1415
 
0.1%
6 1163
 
0.1%
3219107 1082
 
0.1%
5426909 1064
 
0.1%
2702678 1008
 
0.1%
5426949 994
 
0.1%
2654909 868
 
0.1%
2655497 809
 
0.1%
5426932 760
 
0.1%
Other values (141139) 971870
98.7%

Most occurring characters

ValueCountFrequency (%)
2 898604
13.0%
3 797962
11.6%
7 733166
10.6%
5 715553
10.4%
0 648866
9.4%
1 638677
9.3%
8 635333
9.2%
6 626367
9.1%
9 613562
8.9%
4 587820
8.5%
Other values (5) 5
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6895910
> 99.9%
Lowercase Letter 5
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 898604
13.0%
3 797962
11.6%
7 733166
10.6%
5 715553
10.4%
0 648866
9.4%
1 638677
9.3%
8 635333
9.2%
6 626367
9.1%
9 613562
8.9%
4 587820
8.5%
Lowercase Letter
ValueCountFrequency (%)
f 1
20.0%
a 1
20.0%
l 1
20.0%
s 1
20.0%
e 1
20.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6895910
> 99.9%
Latin 5
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
2 898604
13.0%
3 797962
11.6%
7 733166
10.6%
5 715553
10.4%
0 648866
9.4%
1 638677
9.3%
8 635333
9.2%
6 626367
9.1%
9 613562
8.9%
4 587820
8.5%
Latin
ValueCountFrequency (%)
f 1
20.0%
a 1
20.0%
l 1
20.0%
s 1
20.0%
e 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6895915
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 898604
13.0%
3 797962
11.6%
7 733166
10.6%
5 715553
10.4%
0 648866
9.4%
1 638677
9.3%
8 635333
9.2%
6 626367
9.1%
9 613562
8.9%
4 587820
8.5%
Other values (5) 5
 
< 0.1%
Distinct8
Distinct (%)< 0.1%
Missing2
Missing (%)< 0.1%
Memory size7.5 MiB

Length

Max length13
Median length1
Mean length1.000012141
Min length1

Characters and Unicode

Total characters988412
Distinct characters17
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st row6
2nd row6
3rd row6
4th row6
5th row6
ValueCountFrequency (%)
6 907311
91.8%
5 48945
 
5.0%
4 17041
 
1.7%
3 11701
 
1.2%
0 3366
 
0.3%
7 31
 
< 0.1%
1 4
 
< 0.1%
latin_america 1
 
< 0.1%
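The table above is almost entirely single-digit keys, with one stray text value. A single non-numeric entry in an otherwise numeric key column may point to a row whose fields were shifted during parsing, although the report itself does not state the cause. As a hedged sketch, assuming the column is held as a list of strings with `None` for missing cells, such entries could be listed for inspection. The helper name `non_numeric_values` and the `sample` list are hypothetical.

```python
def non_numeric_values(values):
    """Return the non-missing entries that are not plain ASCII digit strings."""
    suspects = []
    for value in values:
        if value is None:  # missing cell, nothing to check
            continue
        text = value.strip()
        # isascii() guards against non-ASCII digit characters that isdigit() accepts
        if not (text.isascii() and text.isdigit()):
            suspects.append(value)
    return suspects


# Illustrative input modelled on the value table above.
sample = ["6", "5", "0", "LATIN_AMERICA", None, "7"]
suspects = non_numeric_values(sample)
print(suspects)
```

Rows flagged this way would still need to be checked against the source file before being dropped or repaired.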

Most occurring characters

ValueCountFrequency (%)
6 907311
91.8%
5 48945
 
5.0%
4 17041
 
1.7%
3 11701
 
1.2%
0 3366
 
0.3%
7 31
 
< 0.1%
1 4
 
< 0.1%
A 3
 
< 0.1%
I 2
 
< 0.1%
L 1
 
< 0.1%
Other values (7) 7
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 988399
> 99.9%
Uppercase Letter 12
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 3
25.0%
I 2
16.7%
L 1
 
8.3%
T 1
 
8.3%
N 1
 
8.3%
M 1
 
8.3%
E 1
 
8.3%
R 1
 
8.3%
C 1
 
8.3%
Decimal Number
ValueCountFrequency (%)
6 907311
91.8%
5 48945
 
5.0%
4 17041
 
1.7%
3 11701
 
1.2%
0 3366
 
0.3%
7 31
 
< 0.1%
1 4
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 988400
> 99.9%
Latin 12
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 3
25.0%
I 2
16.7%
L 1
 
8.3%
T 1
 
8.3%
N 1
 
8.3%
M 1
 
8.3%
E 1
 
8.3%
R 1
 
8.3%
C 1
 
8.3%
Common
ValueCountFrequency (%)
6 907311
91.8%
5 48945
 
5.0%
4 17041
 
1.7%
3 11701
 
1.2%
0 3366
 
0.3%
7 31
 
< 0.1%
1 4
 
< 0.1%
_ 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 988412
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 907311
91.8%
5 48945
 
5.0%
4 17041
 
1.7%
3 11701
 
1.2%
0 3366
 
0.3%
7 31
 
< 0.1%
1 4
 
< 0.1%
A 3
 
< 0.1%
I 2
 
< 0.1%
L 1
 
< 0.1%
Other values (7) 7
 
< 0.1%
Distinct24
Distinct (%)< 0.1%
Missing4754
Missing (%)0.5%
Memory size7.5 MiB

Length

Max length13
Median length7
Mean length6.258306833
Min length1

Characters and Unicode

Total characters6155971
Distinct characters21
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique5 ?
Unique (%)< 0.1%

Sample

1st row106
2nd row7707728
3rd row7707728
4th row7707728
5th row7707728
ValueCountFrequency (%)
7707728 830617
84.4%
95 48276
 
4.9%
35 32695
 
3.3%
106 26385
 
2.7%
98 15149
 
1.5%
68 11694
 
1.2%
36 9268
 
0.9%
9 5937
 
0.6%
8770992 1887
 
0.2%
7819616 1126
 
0.1%
Other values (14) 614
 
0.1%

Most occurring characters

ValueCountFrequency (%)
7 3327391
54.1%
8 860477
 
14.0%
0 858902
 
14.0%
2 832505
 
13.5%
5 80985
 
1.3%
9 74277
 
1.2%
6 49606
 
0.8%
3 42560
 
0.7%
1 28785
 
0.5%
4 470
 
< 0.1%
Other values (11) 13
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6155958
> 99.9%
Uppercase Letter 12
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 3327391
54.1%
8 860477
 
14.0%
0 858902
 
14.0%
2 832505
 
13.5%
5 80985
 
1.3%
9 74277
 
1.2%
6 49606
 
0.8%
3 42560
 
0.7%
1 28785
 
0.5%
4 470
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
A 2
16.7%
R 2
16.7%
I 1
8.3%
E 1
8.3%
M 1
8.3%
N 1
8.3%
H 1
8.3%
T 1
8.3%
O 1
8.3%
C 1
8.3%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6155959
> 99.9%
Latin 12
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
7 3327391
54.1%
8 860477
 
14.0%
0 858902
 
14.0%
2 832505
 
13.5%
5 80985
 
1.3%
9 74277
 
1.2%
6 49606
 
0.8%
3 42560
 
0.7%
1 28785
 
0.5%
4 470
 
< 0.1%
Latin
ValueCountFrequency (%)
A 2
16.7%
R 2
16.7%
I 1
8.3%
E 1
8.3%
M 1
8.3%
N 1
8.3%
H 1
8.3%
T 1
8.3%
O 1
8.3%
C 1
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6155971
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 3327391
54.1%
8 860477
 
14.0%
0 858902
 
14.0%
2 832505
 
13.5%
5 80985
 
1.3%
9 74277
 
1.2%
6 49606
 
0.8%
3 42560
 
0.7%
1 28785
 
0.5%
4 470
 
< 0.1%
Other values (11) 13
 
< 0.1%
Distinct68
Distinct (%)< 0.1%
Missing5481
Missing (%)0.6%
Memory size7.5 MiB

Length

Max length8
Median length3
Mean length3.360726854
Min length3

Characters and Unicode

Total characters3303329
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique11 ?
Unique (%)< 0.1%

Sample

1st row342
2nd row220
3rd row196
4th row220
5th row196
ValueCountFrequency (%)
220 565617
57.5%
196 199036
 
20.2%
7228684 54963
 
5.6%
180 44421
 
4.5%
327 29396
 
3.0%
342 25770
 
2.6%
10774316 11282
 
1.1%
7947184 8448
 
0.9%
195 8422
 
0.9%
7073593 6544
 
0.7%
Other values (58) 29022
 
3.0%

Most occurring characters

ValueCountFrequency (%)
2 1314764
39.8%
0 633807
19.2%
1 302985
 
9.2%
6 272083
 
8.2%
9 235552
 
7.1%
8 166999
 
5.1%
7 141061
 
4.3%
4 123266
 
3.7%
3 88216
 
2.7%
5 24596
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3303329
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 1314764
39.8%
0 633807
19.2%
1 302985
 
9.2%
6 272083
 
8.2%
9 235552
 
7.1%
8 166999
 
5.1%
7 141061
 
4.3%
4 123266
 
3.7%
3 88216
 
2.7%
5 24596
 
0.7%

Most occurring scripts

ValueCountFrequency (%)
Common 3303329
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 1314764
39.8%
0 633807
19.2%
1 302985
 
9.2%
6 272083
 
8.2%
9 235552
 
7.1%
8 166999
 
5.1%
7 141061
 
4.3%
4 123266
 
3.7%
3 88216
 
2.7%
5 24596
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3303329
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 1314764
39.8%
0 633807
19.2%
1 302985
 
9.2%
6 272083
 
8.2%
9 235552
 
7.1%
8 166999
 
5.1%
7 141061
 
4.3%
4 123266
 
3.7%
3 88216
 
2.7%
5 24596
 
0.7%

orderKey
Text

Missing 

Distinct358
Distinct (%)< 0.1%
Missing10134
Missing (%)1.0%
Memory size7.5 MiB

Length

Max length68
Median length8
Mean length3.762241022
Min length3

Characters and Unicode

Total characters3680480
Distinct characters31
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique39 ?
Unique (%)< 0.1%

Sample

1st row670
2nd row1370
3rd row553
4th row7224021
5th row1369
ValueCountFrequency (%)
1369 153750
 
15.7%
414 83320
 
8.5%
408 58318
 
6.0%
1370 55218
 
5.6%
1414 46323
 
4.7%
392 42295
 
4.3%
412 39541
 
4.0%
690 34933
 
3.6%
422 32482
 
3.3%
691 28326
 
2.9%
Other values (353) 403767
41.3%

Most occurring characters

ValueCountFrequency (%)
1 767886
20.9%
4 547702
14.9%
3 458030
12.4%
9 405888
11.0%
6 375958
10.2%
2 317418
8.6%
0 298525
 
8.1%
7 202999
 
5.5%
5 169746
 
4.6%
8 136260
 
3.7%
Other values (21) 68
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3680412
> 99.9%
Lowercase Letter 52
 
< 0.1%
Space Separator 5
 
< 0.1%
Uppercase Letter 5
 
< 0.1%
Other Punctuation 4
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 10
19.2%
n 9
17.3%
e 8
15.4%
o 6
11.5%
l 5
9.6%
i 3
 
5.8%
s 2
 
3.8%
d 2
 
3.8%
t 2
 
3.8%
c 2
 
3.8%
Other values (3) 3
 
5.8%
Decimal Number
ValueCountFrequency (%)
1 767886
20.9%
4 547702
14.9%
3 458030
12.4%
9 405888
11.0%
6 375958
10.2%
2 317418
8.6%
0 298525
 
8.1%
7 202999
 
5.5%
5 169746
 
4.6%
8 136260
 
3.7%
Uppercase Letter
ValueCountFrequency (%)
A 2
40.0%
D 1
20.0%
P 1
20.0%
M 1
20.0%
Space Separator
ValueCountFrequency (%)
5
100.0%
Other Punctuation
ValueCountFrequency (%)
, 4
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3680423
> 99.9%
Latin 57
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 10
17.5%
n 9
15.8%
e 8
14.0%
o 6
10.5%
l 5
8.8%
i 3
 
5.3%
s 2
 
3.5%
d 2
 
3.5%
t 2
 
3.5%
A 2
 
3.5%
Other values (7) 8
14.0%
Common
ValueCountFrequency (%)
1 767886
20.9%
4 547702
14.9%
3 458030
12.4%
9 405888
11.0%
6 375958
10.2%
2 317418
8.6%
0 298525
 
8.1%
7 202999
 
5.5%
5 169746
 
4.6%
8 136260
 
3.7%
Other values (4) 11
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3680480
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 767886
20.9%
4 547702
14.9%
3 458030
12.4%
9 405888
11.0%
6 375958
10.2%
2 317418
8.6%
0 298525
 
8.1%
7 202999
 
5.5%
5 169746
 
4.6%
8 136260
 
3.7%
Other values (21) 68
 
< 0.1%

familyKey
Text

Missing 

Distinct1293
Distinct (%)0.1%
Missing10432
Missing (%)1.1%
Memory size7.5 MiB

Length

Max length8
Median length4
Mean length4.192720636
Min length4

Characters and Unicode

Total characters4100355
Distinct characters16
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique122 ?
Unique (%)< 0.1%

Sample

1st row4376199
2nd row5386
3rd row2763195
4th row6669
5th row3073
ValueCountFrequency (%)
3073 110118
 
11.3%
3065 78427
 
8.0%
5386 51638
 
5.3%
7708 30498
 
3.1%
8798 26201
 
2.7%
6683 16271
 
1.7%
6685 14761
 
1.5%
5015 14530
 
1.5%
8305 14370
 
1.5%
2497 13720
 
1.4%
Other values (1283) 607436
62.1%

Most occurring characters

ValueCountFrequency (%)
6 656925
16.0%
3 648759
15.8%
7 486759
11.9%
8 420571
10.3%
0 419501
10.2%
5 351922
8.6%
2 334385
8.2%
4 300442
7.3%
9 244987
 
6.0%
1 236097
 
5.8%
Other values (6) 7
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4100348
> 99.9%
Lowercase Letter 6
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 656925
16.0%
3 648759
15.8%
7 486759
11.9%
8 420571
10.3%
0 419501
10.2%
5 351922
8.6%
2 334385
8.2%
4 300442
7.3%
9 244987
 
6.0%
1 236097
 
5.8%
Lowercase Letter
ValueCountFrequency (%)
a 2
33.3%
l 1
16.7%
n 1
16.7%
t 1
16.7%
e 1
16.7%
Uppercase Letter
ValueCountFrequency (%)
P 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4100348
> 99.9%
Latin 7
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
6 656925
16.0%
3 648759
15.8%
7 486759
11.9%
8 420571
10.3%
0 419501
10.2%
5 351922
8.6%
2 334385
8.2%
4 300442
7.3%
9 244987
 
6.0%
1 236097
 
5.8%
Latin
ValueCountFrequency (%)
a 2
28.6%
P 1
14.3%
l 1
14.3%
n 1
14.3%
t 1
14.3%
e 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4100355
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 656925
16.0%
3 648759
15.8%
7 486759
11.9%
8 420571
10.3%
0 419501
10.2%
5 351922
8.6%
2 334385
8.2%
4 300442
7.3%
9 244987
 
6.0%
1 236097
 
5.8%
Other values (6) 7
 
< 0.1%

genusKey
Text

Missing 

Distinct14325
Distinct (%)1.5%
Missing15344
Missing (%)1.6%
Memory size7.5 MiB

Length

Max length12
Median length7
Mean length7.02022901
Min length7

Characters and Unicode

Total characters6831090
Distinct characters20
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique2132 ?
Unique (%)0.2%

Sample

1st row2654889
2nd row2947262
3rd row2763692
4th row3032531
5th row7822478
ValueCountFrequency (%)
2721893 12742
 
1.3%
3188558 8772
 
0.9%
2607519 6873
 
0.7%
2704173 6684
 
0.7%
2713455 6044
 
0.6%
2705540 5820
 
0.6%
2928997 5538
 
0.6%
2705322 5205
 
0.5%
2702537 4464
 
0.5%
2650583 4297
 
0.4%
Other values (14315) 906619
93.2%

Most occurring characters

ValueCountFrequency (%)
2 1008029
14.8%
3 842317
12.3%
7 719389
10.5%
8 652757
9.6%
0 646719
9.5%
9 642294
9.4%
1 638239
9.3%
6 608888
8.9%
5 588567
8.6%
4 483879
7.1%
Other values (10) 12
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6831078
> 99.9%
Lowercase Letter 11
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 1008029
14.8%
3 842317
12.3%
7 719389
10.5%
8 652757
9.6%
0 646719
9.5%
9 642294
9.4%
1 638239
9.3%
6 608888
8.9%
5 588567
8.6%
4 483879
7.1%
Lowercase Letter
ValueCountFrequency (%)
a 2
18.2%
h 2
18.2%
o 1
9.1%
y 1
9.1%
p 1
9.1%
e 1
9.1%
c 1
9.1%
r 1
9.1%
t 1
9.1%
Uppercase Letter
ValueCountFrequency (%)
T 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6831078
> 99.9%
Latin 12
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
2 1008029
14.8%
3 842317
12.3%
7 719389
10.5%
8 652757
9.6%
0 646719
9.5%
9 642294
9.4%
1 638239
9.3%
6 608888
8.9%
5 588567
8.6%
4 483879
7.1%
Latin
ValueCountFrequency (%)
a 2
16.7%
h 2
16.7%
o 1
8.3%
y 1
8.3%
p 1
8.3%
T 1
8.3%
e 1
8.3%
c 1
8.3%
r 1
8.3%
t 1
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6831090
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 1008029
14.8%
3 842317
12.3%
7 719389
10.5%
8 652757
9.6%
0 646719
9.5%
9 642294
9.4%
1 638239
9.3%
6 608888
8.9%
5 588567
8.6%
4 483879
7.1%
Other values (10) 12
 
< 0.1%

subgenusKey
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing988401
Missing (%)> 99.9%
Memory size7.5 MiB

Length

Max length13
Median length13
Mean length13
Min length13

Characters and Unicode

Total characters13
Distinct characters10
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowMagnoliopsida
ValueCountFrequency (%)
magnoliopsida 1
100.0%

Most occurring characters

ValueCountFrequency (%)
a 2
15.4%
o 2
15.4%
i 2
15.4%
M 1
7.7%
g 1
7.7%
n 1
7.7%
l 1
7.7%
p 1
7.7%
s 1
7.7%
d 1
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12
92.3%
Uppercase Letter 1
 
7.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2
16.7%
o 2
16.7%
i 2
16.7%
g 1
8.3%
n 1
8.3%
l 1
8.3%
p 1
8.3%
s 1
8.3%
d 1
8.3%
Uppercase Letter
ValueCountFrequency (%)
M 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 13
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2
15.4%
o 2
15.4%
i 2
15.4%
M 1
7.7%
g 1
7.7%
n 1
7.7%
l 1
7.7%
p 1
7.7%
s 1
7.7%
d 1
7.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2
15.4%
o 2
15.4%
i 2
15.4%
M 1
7.7%
g 1
7.7%
n 1
7.7%
l 1
7.7%
p 1
7.7%
s 1
7.7%
d 1
7.7%

speciesKey
Text

Missing 

Distinct: 126812
Distinct (%): 13.9%
Missing: 75442
Missing (%): 7.6%
Memory size: 7.5 MiB

Length

Max length: 11
Median length: 7
Mean length: 7.027005564
Min length: 7

Characters and Unicode

Total characters: 6415375
Distinct characters: 19
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 46507
Unique (%): 5.1%

Sample

1st row: 2654944
2nd row: 2947270
3rd row: 10416230
4th row: 3687053
5th row: 7355530

Value                  Count   Frequency (%)
2655370                1415    0.2%
3219107                1082    0.1%
5426909                1064    0.1%
2702678                1008    0.1%
5426949                994     0.1%
2704276                943     0.1%
2654909                868     0.1%
2655497                809     0.1%
5426932                760     0.1%
8225325                689     0.1%
Other values (126802)  903328  98.9%

Most occurring characters

ValueCountFrequency (%)
2 829656
12.9%
3 752536
11.7%
5 677478
10.6%
7 670460
10.5%
0 607378
9.5%
8 599660
9.3%
1 599651
9.3%
9 568671
8.9%
6 565498
8.8%
4 544376
8.5%
Other values (9) 11
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6415364
> 99.9%
Lowercase Letter 10
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 829656
12.9%
3 752536
11.7%
5 677478
10.6%
7 670460
10.5%
0 607378
9.5%
8 599660
9.3%
1 599651
9.3%
9 568671
8.9%
6 565498
8.8%
4 544376
8.5%
Lowercase Letter
ValueCountFrequency (%)
a 2
20.0%
l 2
20.0%
g 1
10.0%
n 1
10.0%
o 1
10.0%
i 1
10.0%
e 1
10.0%
s 1
10.0%
Uppercase Letter
ValueCountFrequency (%)
M 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6415364
> 99.9%
Latin 11
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
2 829656
12.9%
3 752536
11.7%
5 677478
10.6%
7 670460
10.5%
0 607378
9.5%
8 599660
9.3%
1 599651
9.3%
9 568671
8.9%
6 565498
8.8%
4 544376
8.5%
Latin
ValueCountFrequency (%)
a 2
18.2%
l 2
18.2%
M 1
9.1%
g 1
9.1%
n 1
9.1%
o 1
9.1%
i 1
9.1%
e 1
9.1%
s 1
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6415375
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 829656
12.9%
3 752536
11.7%
5 677478
10.6%
7 670460
10.5%
0 607378
9.5%
8 599660
9.3%
1 599651
9.3%
9 568671
8.9%
6 565498
8.8%
4 544376
8.5%
Other values (9) 11
 
< 0.1%

species
Text

Missing 

Distinct: 126534
Distinct (%): 13.9%
Missing: 75443
Missing (%): 7.6%
Memory size: 7.5 MiB

Length

Max length: 36
Median length: 32
Mean length: 18.99834275
Min length: 8

Characters and Unicode

Total characters: 17344708
Distinct characters: 54
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 46337
Unique (%): 5.1%

Sample

1st row: Phymatolithon calcareum
2nd row: Amicia glandulosa
3rd row: Callisia glandulosa
4th row: Connarus steyermarkii
5th row: Trichoneura grandiglumis

Value                 Count    Frequency (%)
carex                 12516    0.7%
miconia               8270     0.5%
poa                   6546     0.4%
cladonia              6511     0.4%
cyperus               5985     0.3%
paspalum              5640     0.3%
solanum               5444     0.3%
eragrostis            5024     0.3%
dichanthelium         4451     0.2%
asplenium             4181     0.2%
Other values (53483)  1761460  96.5%

Most occurring characters

ValueCountFrequency (%)
a 2129654
 
12.3%
i 1716826
 
9.9%
e 1157262
 
6.7%
r 1076152
 
6.2%
o 1063266
 
6.1%
s 1029428
 
5.9%
l 993778
 
5.7%
n 939467
 
5.4%
913069
 
5.3%
u 892693
 
5.1%
Other values (44) 5433113
31.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15514709
89.4%
Space Separator 913069
 
5.3%
Uppercase Letter 912981
 
5.3%
Dash Punctuation 3949
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2129654
13.7%
i 1716826
11.1%
e 1157262
 
7.5%
r 1076152
 
6.9%
o 1063266
 
6.9%
s 1029428
 
6.6%
l 993778
 
6.4%
n 939467
 
6.1%
u 892693
 
5.8%
t 778677
 
5.0%
Other values (16) 3737506
24.1%
Uppercase Letter
ValueCountFrequency (%)
C 127940
14.0%
P 122237
13.4%
S 92668
10.2%
A 81730
 
9.0%
M 60331
 
6.6%
E 52038
 
5.7%
L 45549
 
5.0%
D 43846
 
4.8%
B 37463
 
4.1%
H 37313
 
4.1%
Other values (16) 211866
23.2%
Space Separator
ValueCountFrequency (%)
913069
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3949
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16427690
94.7%
Common 917018
 
5.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2129654
13.0%
i 1716826
 
10.5%
e 1157262
 
7.0%
r 1076152
 
6.6%
o 1063266
 
6.5%
s 1029428
 
6.3%
l 993778
 
6.0%
n 939467
 
5.7%
u 892693
 
5.4%
t 778677
 
4.7%
Other values (42) 4650487
28.3%
Common
ValueCountFrequency (%)
913069
99.6%
- 3949
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17344708
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2129654
 
12.3%
i 1716826
 
9.9%
e 1157262
 
6.7%
r 1076152
 
6.2%
o 1063266
 
6.1%
s 1029428
 
5.9%
l 993778
 
5.7%
n 939467
 
5.4%
913069
 
5.3%
u 892693
 
5.1%
Other values (44) 5433113
31.3%

Distinct: 141148
Distinct (%): 14.3%
Missing: 3368
Missing (%): 0.3%
Memory size: 7.5 MiB

Length

Max length: 145
Median length: 98
Mean length: 31.85947389
Min length: 5

Characters and Unicode

Total characters: 31382665
Distinct characters: 124
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 52484
Unique (%): 5.3%

Sample

1st row: Phymatolithon calcareum (Pallas) Adey & D.L.McKibbin
2nd row: Amicia glandulosa Kunth
3rd row: Callisia glandulosa (Seub.) Christenh. & Byng
4th row: Connarus steyermarkii Prance
5th row: Trichoneura grandiglumis (Nees) Ekman

Value                 Count    Frequency (%)
l                     161894   4.2%
(blank)               145790   3.8%
ex                    72859    1.9%
var                   29546    0.8%
subsp                 28283    0.7%
kunth                 26788    0.7%
dc                    25626    0.7%
benth                 22744    0.6%
a.gray                22225    0.6%
sw                    20887    0.5%
Other values (67669)  3311496  85.6%

Most occurring characters

ValueCountFrequency (%)
2883104
 
9.2%
a 2801876
 
8.9%
i 2180333
 
6.9%
e 1948892
 
6.2%
r 1712734
 
5.5%
o 1536718
 
4.9%
l 1524818
 
4.9%
. 1445546
 
4.6%
n 1442074
 
4.6%
s 1413219
 
4.5%
Other values (114) 12493351
39.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 22800431
72.7%
Uppercase Letter 3092925
 
9.9%
Space Separator 2883104
 
9.2%
Other Punctuation 1639202
 
5.2%
Open Punctuation 417352
 
1.3%
Close Punctuation 417352
 
1.3%
Decimal Number 116016
 
0.4%
Dash Punctuation 13551
 
< 0.1%
Math Symbol 2707
 
< 0.1%
Connector Punctuation 25
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2801876
12.3%
i 2180333
 
9.6%
e 1948892
 
8.5%
r 1712734
 
7.5%
o 1536718
 
6.7%
l 1524818
 
6.7%
n 1442074
 
6.3%
s 1413219
 
6.2%
u 1243776
 
5.5%
t 1180783
 
5.2%
Other values (55) 5815208
25.5%
Uppercase Letter
ValueCountFrequency (%)
L 310532
 
10.0%
S 288516
 
9.3%
C 270821
 
8.8%
P 222781
 
7.2%
A 214829
 
6.9%
M 212413
 
6.9%
B 199884
 
6.5%
H 180134
 
5.8%
R 146849
 
4.7%
D 145970
 
4.7%
Other values (29) 900196
29.1%
Decimal Number
ValueCountFrequency (%)
1 33535
28.9%
8 23677
20.4%
9 15053
13.0%
0 7402
 
6.4%
3 7186
 
6.2%
2 7122
 
6.1%
7 6802
 
5.9%
4 5757
 
5.0%
6 5008
 
4.3%
5 4474
 
3.9%
Other Punctuation
ValueCountFrequency (%)
. 1445546
88.2%
& 145790
 
8.9%
, 46048
 
2.8%
' 1818
 
0.1%
Space Separator
ValueCountFrequency (%)
2883104
100.0%
Open Punctuation
ValueCountFrequency (%)
( 417352
100.0%
Close Punctuation
ValueCountFrequency (%)
) 417352
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 13551
100.0%
Math Symbol
ValueCountFrequency (%)
× 2707
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 25
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 25893356
82.5%
Common 5489309
 
17.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2801876
 
10.8%
i 2180333
 
8.4%
e 1948892
 
7.5%
r 1712734
 
6.6%
o 1536718
 
5.9%
l 1524818
 
5.9%
n 1442074
 
5.6%
s 1413219
 
5.5%
u 1243776
 
4.8%
t 1180783
 
4.6%
Other values (94) 8908133
34.4%
Common
ValueCountFrequency (%)
2883104
52.5%
. 1445546
26.3%
( 417352
 
7.6%
) 417352
 
7.6%
& 145790
 
2.7%
, 46048
 
0.8%
1 33535
 
0.6%
8 23677
 
0.4%
9 15053
 
0.3%
- 13551
 
0.2%
Other values (10) 48301
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 31326567
99.8%
None 56098
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2883104
 
9.2%
a 2801876
 
8.9%
i 2180333
 
7.0%
e 1948892
 
6.2%
r 1712734
 
5.5%
o 1536718
 
4.9%
l 1524818
 
4.9%
. 1445546
 
4.6%
n 1442074
 
4.6%
s 1413219
 
4.5%
Other values (61) 12437253
39.7%
None
ValueCountFrequency (%)
ü 16171
28.8%
é 10022
17.9%
ö 8139
14.5%
ä 3690
 
6.6%
á 3643
 
6.5%
× 2707
 
4.8%
ø 1855
 
3.3%
Á 1839
 
3.3%
ó 1209
 
2.2%
è 874
 
1.6%
Other values (43) 5949
 
10.6%

Distinct: 177770
Distinct (%): 18.0%
Missing: 3017
Missing (%): 0.3%
Memory size: 7.5 MiB

Length

Max length: 125
Median length: 94
Mean length: 19.78091812
Min length: 6

Characters and Unicode

Total characters: 19491820
Distinct characters: 88
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 81124
Unique (%): 8.2%

Sample

1st row: Lithothamnion calcareum
2nd row: Amicia glandulosa
3rd row: Tripogandra glandulosa
4th row: Connarus steyermarkii
5th row: Trichoneura grandiglumis

Value                 Count    Frequency (%)
sp                    59300    2.8%
var                   45918    2.2%
subsp                 23075    1.1%
carex                 12732    0.6%
indet                 9106     0.4%
poa                   6687     0.3%
cyperus               6038     0.3%
cladonia              5900     0.3%
paspalum              5802     0.3%
miconia               5464     0.3%
Other values (64311)  1952155  91.6%

Most occurring characters

ValueCountFrequency (%)
a 2343786
 
12.0%
i 1854416
 
9.5%
e 1259869
 
6.5%
s 1219162
 
6.3%
r 1207808
 
6.2%
1146792
 
5.9%
o 1138492
 
5.8%
l 1071612
 
5.5%
n 1027785
 
5.3%
u 997253
 
5.1%
Other values (78) 6224845
31.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 17191337
88.2%
Space Separator 1146792
 
5.9%
Uppercase Letter 994154
 
5.1%
Other Punctuation 150624
 
0.8%
Dash Punctuation 4567
 
< 0.1%
Decimal Number 1516
 
< 0.1%
Open Punctuation 1412
 
< 0.1%
Close Punctuation 1412
 
< 0.1%
Math Symbol 6
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2343786
13.6%
i 1854416
10.8%
e 1259869
 
7.3%
s 1219162
 
7.1%
r 1207808
 
7.0%
o 1138492
 
6.6%
l 1071612
 
6.2%
n 1027785
 
6.0%
u 997253
 
5.8%
t 850651
 
4.9%
Other values (24) 4220503
24.6%
Uppercase Letter
ValueCountFrequency (%)
C 141099
14.2%
P 128999
13.0%
S 97726
9.8%
A 88710
 
8.9%
M 63577
 
6.4%
L 53431
 
5.4%
E 52795
 
5.3%
D 46397
 
4.7%
B 42740
 
4.3%
H 40423
 
4.1%
Other values (19) 238257
24.0%
Decimal Number
ValueCountFrequency (%)
2 489
32.3%
0 361
23.8%
1 328
21.6%
5 215
14.2%
7 31
 
2.0%
3 30
 
2.0%
9 29
 
1.9%
8 17
 
1.1%
6 12
 
0.8%
4 4
 
0.3%
Other Punctuation
ValueCountFrequency (%)
. 147866
98.2%
, 1090
 
0.7%
' 972
 
0.6%
& 510
 
0.3%
? 102
 
0.1%
" 42
 
< 0.1%
/ 40
 
< 0.1%
# 2
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 1408
99.7%
[ 4
 
0.3%
Close Punctuation
ValueCountFrequency (%)
) 1408
99.7%
] 4
 
0.3%
Space Separator
ValueCountFrequency (%)
1146792
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4567
100.0%
Math Symbol
ValueCountFrequency (%)
× 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18185491
93.3%
Common 1306329
 
6.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2343786
12.9%
i 1854416
 
10.2%
e 1259869
 
6.9%
s 1219162
 
6.7%
r 1207808
 
6.6%
o 1138492
 
6.3%
l 1071612
 
5.9%
n 1027785
 
5.7%
u 997253
 
5.5%
t 850651
 
4.7%
Other values (53) 5214657
28.7%
Common
ValueCountFrequency (%)
1146792
87.8%
. 147866
 
11.3%
- 4567
 
0.3%
( 1408
 
0.1%
) 1408
 
0.1%
, 1090
 
0.1%
' 972
 
0.1%
& 510
 
< 0.1%
2 489
 
< 0.1%
0 361
 
< 0.1%
Other values (15) 866
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19491528
> 99.9%
None 292
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2343786
 
12.0%
i 1854416
 
9.5%
e 1259869
 
6.5%
s 1219162
 
6.3%
r 1207808
 
6.2%
1146792
 
5.9%
o 1138492
 
5.8%
l 1071612
 
5.5%
n 1027785
 
5.3%
u 997253
 
5.1%
Other values (66) 6224553
31.9%
None
ValueCountFrequency (%)
ë 174
59.6%
ü 27
 
9.2%
ö 25
 
8.6%
á 23
 
7.9%
é 12
 
4.1%
Á 11
 
3.8%
ó 7
 
2.4%
× 6
 
2.1%
É 4
 
1.4%
Ø 1
 
0.3%
Other values (2) 2
 
0.7%

protocol
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 3
Missing (%): < 0.1%
Memory size: 7.5 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2965197
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: EML
2nd row: EML
3rd row: EML
4th row: EML
5th row: EML

Value  Count   Frequency (%)
eml    988399  100.0%

Most occurring characters

ValueCountFrequency (%)
E 988399
33.3%
M 988399
33.3%
L 988399
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2965197
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 988399
33.3%
M 988399
33.3%
L 988399
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 2965197
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 988399
33.3%
M 988399
33.3%
L 988399
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2965197
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 988399
33.3%
M 988399
33.3%
L 988399
33.3%

Distinct: 200353
Distinct (%): 20.3%
Missing: 2
Missing (%): < 0.1%
Memory size: 7.5 MiB

Length

Max length: 24
Median length: 24
Mean length: 23.99574464
Min length: 6

Characters and Unicode

Total characters: 23717394
Distinct characters: 19
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20728
Unique (%): 2.1%

Sample

1st row: 2024-12-02T13:59:14.452Z
2nd row: 2024-12-02T13:57:49.629Z
3rd row: 2024-12-02T13:57:49.533Z
4th row: 2024-12-02T13:59:17.370Z
5th row: 2024-12-02T13:59:30.710Z

Value                     Count   Frequency (%)
2024-12-02t13:56:52.667z  24      < 0.1%
2024-12-02t13:57:28.323z  24      < 0.1%
2024-12-02t13:57:53.831z  24      < 0.1%
2024-12-02t13:57:53.200z  23      < 0.1%
2024-12-02t13:57:24.579z  23      < 0.1%
2024-12-02t13:57:45.844z  23      < 0.1%
2024-12-02t13:57:43.276z  23      < 0.1%
2024-12-02t13:57:45.207z  23      < 0.1%
2024-12-02t13:57:50.630z  22      < 0.1%
2024-12-02t13:57:52.903z  22      < 0.1%
Other values (200343)     988169  > 99.9%

Most occurring characters

ValueCountFrequency (%)
2 4511296
19.0%
0 2508198
10.6%
1 2493151
10.5%
- 1976798
8.3%
: 1976798
8.3%
4 1590376
 
6.7%
5 1570391
 
6.6%
3 1563965
 
6.6%
T 988399
 
4.2%
Z 988399
 
4.2%
Other values (9) 3549623
15.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16799642
70.8%
Other Punctuation 2964150
 
12.5%
Uppercase Letter 1976799
 
8.3%
Dash Punctuation 1976798
 
8.3%
Lowercase Letter 5
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 4511296
26.9%
0 2508198
14.9%
1 2493151
14.8%
4 1590376
 
9.5%
5 1570391
 
9.3%
3 1563965
 
9.3%
7 759694
 
4.5%
9 633972
 
3.8%
6 594333
 
3.5%
8 574266
 
3.4%
Uppercase Letter
ValueCountFrequency (%)
T 988399
50.0%
Z 988399
50.0%
A 1
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
n 3
60.0%
o 1
 
20.0%
a 1
 
20.0%
Other Punctuation
ValueCountFrequency (%)
: 1976798
66.7%
. 987352
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 1976798
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 21740590
91.7%
Latin 1976804
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
2 4511296
20.8%
0 2508198
11.5%
1 2493151
11.5%
- 1976798
9.1%
: 1976798
9.1%
4 1590376
 
7.3%
5 1570391
 
7.2%
3 1563965
 
7.2%
. 987352
 
4.5%
7 759694
 
3.5%
Other values (3) 1802571
 
8.3%
Latin
ValueCountFrequency (%)
T 988399
50.0%
Z 988399
50.0%
n 3
 
< 0.1%
A 1
 
< 0.1%
o 1
 
< 0.1%
a 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23717394
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 4511296
19.0%
0 2508198
10.6%
1 2493151
10.5%
- 1976798
8.3%
: 1976798
8.3%
4 1590376
 
6.7%
5 1570391
 
6.6%
3 1563965
 
6.6%
T 988399
 
4.2%
Z 988399
 
4.2%
Other values (9) 3549623
15.0%

Distinct: 2
Distinct (%): < 0.1%
Missing: 2
Missing (%): < 0.1%
Memory size: 7.5 MiB

Length

Max length: 24
Median length: 24
Mean length: 23.99998381
Min length: 8

Characters and Unicode

Total characters: 23721584
Distinct characters: 18
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 2024-12-02T11:48:23.416Z
2nd row: 2024-12-02T11:48:23.416Z
3rd row: 2024-12-02T11:48:23.416Z
4th row: 2024-12-02T11:48:23.416Z
5th row: 2024-12-02T11:48:23.416Z

Value                     Count   Frequency (%)
2024-12-02t11:48:23.416z  988399  > 99.9%
rollinia                  1       < 0.1%

Most occurring characters

ValueCountFrequency (%)
2 4941995
20.8%
1 3953596
16.7%
4 2965197
12.5%
0 1976798
 
8.3%
- 1976798
 
8.3%
: 1976798
 
8.3%
Z 988399
 
4.2%
6 988399
 
4.2%
. 988399
 
4.2%
3 988399
 
4.2%
Other values (8) 1976806
8.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16802783
70.8%
Other Punctuation 2965197
 
12.5%
Uppercase Letter 1976799
 
8.3%
Dash Punctuation 1976798
 
8.3%
Lowercase Letter 7
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 4941995
29.4%
1 3953596
23.5%
4 2965197
17.6%
0 1976798
 
11.8%
6 988399
 
5.9%
3 988399
 
5.9%
8 988399
 
5.9%
Lowercase Letter
ValueCountFrequency (%)
l 2
28.6%
i 2
28.6%
o 1
14.3%
n 1
14.3%
a 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
Z 988399
50.0%
T 988399
50.0%
R 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
: 1976798
66.7%
. 988399
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 1976798
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 21744778
91.7%
Latin 1976806
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
2 4941995
22.7%
1 3953596
18.2%
4 2965197
13.6%
0 1976798
 
9.1%
- 1976798
 
9.1%
: 1976798
 
9.1%
6 988399
 
4.5%
. 988399
 
4.5%
3 988399
 
4.5%
8 988399
 
4.5%
Latin
ValueCountFrequency (%)
Z 988399
50.0%
T 988399
50.0%
l 2
 
< 0.1%
i 2
 
< 0.1%
R 1
 
< 0.1%
o 1
 
< 0.1%
n 1
 
< 0.1%
a 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23721584
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 4941995
20.8%
1 3953596
16.7%
4 2965197
12.5%
0 1976798
 
8.3%
- 1976798
 
8.3%
: 1976798
 
8.3%
Z 988399
 
4.2%
6 988399
 
4.2%
. 988399
 
4.2%
3 988399
 
4.2%
Other values (8) 1976806
8.3%

Distinct: 2
Distinct (%): < 0.1%
Missing: 9253
Missing (%): 0.9%
Memory size: 7.5 MiB

Length

Max length: 5
Median length: 4
Mean length: 4.297422558
Min length: 4

Characters and Unicode

Total characters: 4207817
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: false
2nd row: true
3rd row: true
4th row: true
5th row: true

Value  Count   Frequency (%)
true   687928  70.3%
false  291221  29.7%

Most occurring characters

ValueCountFrequency (%)
e 979149
23.3%
t 687928
16.3%
r 687928
16.3%
u 687928
16.3%
f 291221
 
6.9%
a 291221
 
6.9%
l 291221
 
6.9%
s 291221
 
6.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4207817
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 979149
23.3%
t 687928
16.3%
r 687928
16.3%
u 687928
16.3%
f 291221
 
6.9%
a 291221
 
6.9%
l 291221
 
6.9%
s 291221
 
6.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 4207817
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 979149
23.3%
t 687928
16.3%
r 687928
16.3%
u 687928
16.3%
f 291221
 
6.9%
a 291221
 
6.9%
l 291221
 
6.9%
s 291221
 
6.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4207817
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 979149
23.3%
t 687928
16.3%
r 687928
16.3%
u 687928
16.3%
f 291221
 
6.9%
a 291221
 
6.9%
l 291221
 
6.9%
s 291221
 
6.9%

projectId
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 988401
Missing (%): > 99.9%
Memory size: 7.5 MiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 6
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: edulis

Value   Count  Frequency (%)
edulis  1      100.0%

Most occurring characters

ValueCountFrequency (%)
e 1
16.7%
d 1
16.7%
u 1
16.7%
l 1
16.7%
i 1
16.7%
s 1
16.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1
16.7%
d 1
16.7%
u 1
16.7%
l 1
16.7%
i 1
16.7%
s 1
16.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 6
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1
16.7%
d 1
16.7%
u 1
16.7%
l 1
16.7%
i 1
16.7%
s 1
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1
16.7%
d 1
16.7%
u 1
16.7%
l 1
16.7%
i 1
16.7%
s 1
16.7%

Distinct: 3
Distinct (%): < 0.1%
Missing: 2
Missing (%): < 0.1%
Memory size: 7.5 MiB

Length

Max length: 5
Median length: 5
Mean length: 4.999925132
Min length: 4

Characters and Unicode

Total characters: 4941926
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: false
2nd row: false
3rd row: false
4th row: false
5th row: false

Value  Count   Frequency (%)
false  988325  > 99.9%
true   74      < 0.1%
acuta  1       < 0.1%

Most occurring characters

ValueCountFrequency (%)
e 988399
20.0%
a 988327
20.0%
f 988325
20.0%
l 988325
20.0%
s 988325
20.0%
t 75
 
< 0.1%
u 75
 
< 0.1%
r 74
 
< 0.1%
c 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4941926
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 988399
20.0%
a 988327
20.0%
f 988325
20.0%
l 988325
20.0%
s 988325
20.0%
t 75
 
< 0.1%
u 75
 
< 0.1%
r 74
 
< 0.1%
c 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 4941926
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 988399
20.0%
a 988327
20.0%
f 988325
20.0%
l 988325
20.0%
s 988325
20.0%
t 75
 
< 0.1%
u 75
 
< 0.1%
r 74
 
< 0.1%
c 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4941926
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 988399
20.0%
a 988327
20.0%
f 988325
20.0%
l 988325
20.0%
s 988325
20.0%
t 75
 
< 0.1%
u 75
 
< 0.1%
r 74
 
< 0.1%
c 1
 
< 0.1%

gbifRegion
Text

Missing 

Distinct: 7
Distinct (%): < 0.1%
Missing: 19586
Missing (%): 2.0%
Memory size: 7.5 MiB

Length

Max length: 13
Median length: 13
Mean length: 11.14384878
Min length: 4

Characters and Unicode

Total characters: 10796339
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NORTH_AMERICA
2nd row: LATIN_AMERICA
3rd row: LATIN_AMERICA
4th row: LATIN_AMERICA
5th row: AFRICA

Value          Count   Frequency (%)
latin_america  416098  42.9%
north_america  317523  32.8%
asia           99994   10.3%
europe         56004   5.8%
oceania        44344   4.6%
africa         33918   3.5%
antarctica     935     0.1%

Most occurring characters

ValueCountFrequency (%)
A 2242657
20.8%
I 1328910
12.3%
R 1142001
10.6%
E 889973
 
8.2%
C 813753
 
7.5%
N 778900
 
7.2%
T 735491
 
6.8%
_ 733621
 
6.8%
M 733621
 
6.8%
O 417871
 
3.9%
Other values (6) 979541
9.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 10062718
93.2%
Connector Punctuation 733621
 
6.8%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 2242657
22.3%
I 1328910
13.2%
R 1142001
11.3%
E 889973
 
8.8%
C 813753
 
8.1%
N 778900
 
7.7%
T 735491
 
7.3%
M 733621
 
7.3%
O 417871
 
4.2%
L 416098
 
4.1%
Other values (5) 563443
 
5.6%
Connector Punctuation
ValueCountFrequency (%)
_ 733621
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10062718
93.2%
Common 733621
 
6.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 2242657
22.3%
I 1328910
13.2%
R 1142001
11.3%
E 889973
 
8.8%
C 813753
 
8.1%
N 778900
 
7.7%
T 735491
 
7.3%
M 733621
 
7.3%
O 417871
 
4.2%
L 416098
 
4.1%
Other values (5) 563443
 
5.6%
Common
ValueCountFrequency (%)
_ 733621
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10796339
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 2242657
20.8%
I 1328910
12.3%
R 1142001
10.6%
E 889973
 
8.2%
C 813753
 
7.5%
N 778900
 
7.2%
T 735491
 
6.8%
_ 733621
 
6.8%
M 733621
 
6.8%
O 417871
 
3.9%
Other values (6) 979541
9.1%

Distinct: 2
Distinct (%): < 0.1%
Missing: 2
Missing (%): < 0.1%
Memory size: 7.5 MiB

Length

Max length: 13
Median length: 13
Mean length: 12.99999393
Min length: 7

Characters and Unicode

Total characters: 12849194
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: NORTH_AMERICA
2nd row: NORTH_AMERICA
3rd row: NORTH_AMERICA
4th row: NORTH_AMERICA
5th row: NORTH_AMERICA

Value          Count   Frequency (%)
north_america  988399  > 99.9%
variety        1       < 0.1%

Most occurring characters

ValueCountFrequency (%)
R 1976799
15.4%
A 1976799
15.4%
T 988400
7.7%
E 988400
7.7%
I 988400
7.7%
N 988399
7.7%
O 988399
7.7%
H 988399
7.7%
_ 988399
7.7%
M 988399
7.7%
Other values (3) 988401
7.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 11860795
92.3%
Connector Punctuation 988399
 
7.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 1976799
16.7%
A 1976799
16.7%
T 988400
8.3%
E 988400
8.3%
I 988400
8.3%
N 988399
8.3%
O 988399
8.3%
H 988399
8.3%
M 988399
8.3%
C 988399
8.3%
Other values (2) 2
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 988399
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11860795
92.3%
Common 988399
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 1976799
16.7%
A 1976799
16.7%
T 988400
8.3%
E 988400
8.3%
I 988400
8.3%
N 988399
8.3%
O 988399
8.3%
H 988399
8.3%
M 988399
8.3%
C 988399
8.3%
Other values (2) 2
 
< 0.1%
Common
ValueCountFrequency (%)
_ 988399
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12849194
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 1976799
15.4%
A 1976799
15.4%
T 988400
7.7%
E 988400
7.7%
I 988400
7.7%
N 988399
7.7%
O 988399
7.7%
H 988399
7.7%
_ 988399
7.7%
M 988399
7.7%
Other values (3) 988401
7.7%

level0Gid
Text

Missing 

Distinct: 195
Distinct (%): 0.1%
Missing: 854767
Missing (%): 86.5%
Memory size: 7.5 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 400905
Distinct characters: 29
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11
Unique (%): < 0.1%

Sample

1st row: DOM
2nd row: CHL
3rd row: GUY
4th row: USA
5th row: BRA

Value               Count  Frequency (%)
usa                 23761  17.8%
guy                 14629  10.9%
bra                 11793  8.8%
mex                 10743  8.0%
ven                 10534  7.9%
ecu                 6689   5.0%
guf                 4499   3.4%
bol                 4471   3.3%
per                 4388   3.3%
col                 3749   2.8%
Other values (185)  38379  28.7%

Most occurring characters

ValueCountFrequency (%)
U 55471
13.8%
A 49231
12.3%
E 33716
 
8.4%
S 30480
 
7.6%
R 25143
 
6.3%
G 24739
 
6.2%
N 22898
 
5.7%
C 20842
 
5.2%
B 18493
 
4.6%
M 17528
 
4.4%
Other values (19) 102364
25.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 400897
> 99.9%
Decimal Number 8
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 55471
13.8%
A 49231
12.3%
E 33716
 
8.4%
S 30480
 
7.6%
R 25143
 
6.3%
G 24739
 
6.2%
N 22898
 
5.7%
C 20842
 
5.2%
B 18493
 
4.6%
M 17528
 
4.4%
Other values (16) 102356
25.5%
Decimal Number
ValueCountFrequency (%)
0 4
50.0%
7 3
37.5%
6 1
 
12.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 400897
> 99.9%
Common 8
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 55471
13.8%
A 49231
12.3%
E 33716
 
8.4%
S 30480
 
7.6%
R 25143
 
6.3%
G 24739
 
6.2%
N 22898
 
5.7%
C 20842
 
5.2%
B 18493
 
4.6%
M 17528
 
4.4%
Other values (16) 102356
25.5%
Common
ValueCountFrequency (%)
0 4
50.0%
7 3
37.5%
6 1
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 400905
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 55471
13.8%
A 49231
12.3%
E 33716
 
8.4%
S 30480
 
7.6%
R 25143
 
6.3%
G 24739
 
6.2%
N 22898
 
5.7%
C 20842
 
5.2%
B 18493
 
4.6%
M 17528
 
4.4%
Other values (19) 102364
25.5%

level0Name
Text

Missing 

Distinct: 195
Distinct (%): 0.1%
Missing: 854767
Missing (%): 86.5%
Memory size: 7.5 MiB

Length

Max length32
Median length27
Mean length8.588745463
Min length4

Characters and Unicode

Total characters1147757
Distinct characters62
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique11 ?
Unique (%)< 0.1%

Sample

1st rowDominican Republic
2nd rowChile
3rd rowGuyana
4th rowUnited States
5th rowBrazil
Value                     Count   Frequency (%)
united                    23809   13.7%
states                    23778   13.7%
guyana                    14629   8.4%
brazil                    11793   6.8%
méxico                    10743   6.2%
venezuela                 10534   6.1%
ecuador                   6689    3.9%
french                    5197    3.0%
guiana                    4499    2.6%
bolivia                   4471    2.6%
Other values (224)        57483   33.1%

Most occurring characters

ValueCountFrequency (%)
a 163858
14.3%
e 105662
 
9.2%
i 94961
 
8.3%
n 84634
 
7.4%
t 81902
 
7.1%
u 56381
 
4.9%
r 43581
 
3.8%
o 42775
 
3.7%
39990
 
3.5%
l 39687
 
3.5%
Other values (52) 394326
34.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 933880
81.4%
Uppercase Letter 173438
 
15.1%
Space Separator 39990
 
3.5%
Other Punctuation 441
 
< 0.1%
Dash Punctuation 4
 
< 0.1%
Open Punctuation 2
 
< 0.1%
Close Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 163858
17.5%
e 105662
11.3%
i 94961
10.2%
n 84634
9.1%
t 81902
8.8%
u 56381
 
6.0%
r 43581
 
4.7%
o 42775
 
4.6%
l 39687
 
4.2%
d 38966
 
4.2%
Other values (21) 181473
19.4%
Uppercase Letter
ValueCountFrequency (%)
S 29799
17.2%
U 24119
13.9%
G 21942
12.7%
B 17299
10.0%
M 13763
7.9%
C 13726
7.9%
V 10908
 
6.3%
P 10111
 
5.8%
E 7473
 
4.3%
F 5341
 
3.1%
Other values (14) 18957
10.9%
Other Punctuation
ValueCountFrequency (%)
. 192
43.5%
, 168
38.1%
' 81
18.4%
Space Separator
ValueCountFrequency (%)
39990
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1107318
96.5%
Common 40439
 
3.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 163858
14.8%
e 105662
 
9.5%
i 94961
 
8.6%
n 84634
 
7.6%
t 81902
 
7.4%
u 56381
 
5.1%
r 43581
 
3.9%
o 42775
 
3.9%
l 39687
 
3.6%
d 38966
 
3.5%
Other values (45) 354911
32.1%
Common
ValueCountFrequency (%)
39990
98.9%
. 192
 
0.5%
, 168
 
0.4%
' 81
 
0.2%
- 4
 
< 0.1%
( 2
 
< 0.1%
) 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1136834
99.0%
None 10923
 
1.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 163858
14.4%
e 105662
 
9.3%
i 94961
 
8.4%
n 84634
 
7.4%
t 81902
 
7.2%
u 56381
 
5.0%
r 43581
 
3.8%
o 42775
 
3.8%
39990
 
3.5%
l 39687
 
3.5%
Other values (47) 383403
33.7%
None
ValueCountFrequency (%)
é 10760
98.5%
ô 81
 
0.7%
ç 68
 
0.6%
ã 7
 
0.1%
í 7
 
0.1%
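
The Length figures reported for level0Name (max 32, median 27, mean 8.59, min 4) cannot all hold at once. At least half of the values are no shorter than the median and the rest are no shorter than the minimum, so the mean must be at least (min + median) / 2, which is 15.5 here. The sketch below recomputes the figures from raw strings and applies that bound. The function names and the sample list are hypothetical, and the source does not say which of the reported figures is off.

```python
from statistics import mean, median


def length_stats(values):
    """Max, median, mean and min string length over non-missing values."""
    lengths = [len(v) for v in values if v is not None]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


def summary_is_consistent(stats, tol=1e-9):
    """Check (min + median) / 2 <= mean <= (max + median) / 2."""
    lower = (stats["min"] + stats["median"]) / 2
    upper = (stats["max"] + stats["median"]) / 2
    return lower - tol <= stats["mean"] <= upper + tol


# Figures as printed in the report for level0Name.
reported = {"max": 32, "median": 27, "mean": 8.588745463, "min": 4}
print(summary_is_consistent(reported))  # False: the lower bound is 15.5

# Hypothetical sample, recomputed from the raw strings.
recomputed = length_stats(["Guyana", "Brazil", "United States", None, "Chile"])
print(recomputed, summary_is_consistent(recomputed))
```

The same check can be applied to the other name columns, whose reported medians (27 to 29) also sit far above means of roughly 10 to 11.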

level1Gid
Text

Missing 

Distinct: 1703
Distinct (%): 1.3%
Missing: 855021
Missing (%): 86.5%
Memory size: 7.5 MiB

Length

Max length: 8
Median length: 7
Mean length: 7.447372564
Min length: 6

Characters and Unicode

Total characters: 993338
Distinct characters: 38
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 338
Unique (%): 0.3%

Sample

1st row: DOM.26_1
2nd row: CHL.6_1
3rd row: GUY.2_1
4th row: USA.47_1
5th row: BRA.1_1
Value                     Count   Frequency (%)
usa.21_1                  4566    3.4%
guy.8_1                   4001    3.0%
usa.47_1                  3961    3.0%
guy.10_1                  3952    3.0%
guy.2_1                   3604    2.7%
ven.1_1                   3448    2.6%
usa.9_1                   3286    2.5%
ven.6_1                   3215    2.4%
guf.1_1                   2886    2.2%
usa.2_1                   2677    2.0%
Other values (1693)       97785   73.3%

Most occurring characters

ValueCountFrequency (%)
1 178102
17.9%
_ 133380
13.4%
. 133323
13.4%
U 55403
 
5.6%
A 48963
 
4.9%
2 44340
 
4.5%
E 33716
 
3.4%
S 30480
 
3.1%
R 25125
 
2.5%
G 24740
 
2.5%
Other values (28) 285766
28.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 400138
40.3%
Decimal Number 326497
32.9%
Connector Punctuation 133380
 
13.4%
Other Punctuation 133323
 
13.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 55403
13.8%
A 48963
12.2%
E 33716
 
8.4%
S 30480
 
7.6%
R 25125
 
6.3%
G 24740
 
6.2%
N 22866
 
5.7%
C 20750
 
5.2%
B 18493
 
4.6%
M 17528
 
4.4%
Other values (16) 102074
25.5%
Decimal Number
ValueCountFrequency (%)
1 178102
54.5%
2 44340
 
13.6%
4 18773
 
5.7%
3 15053
 
4.6%
9 12569
 
3.8%
6 12395
 
3.8%
5 12323
 
3.8%
8 12169
 
3.7%
0 11022
 
3.4%
7 9751
 
3.0%
Connector Punctuation
ValueCountFrequency (%)
_ 133380
100.0%
Other Punctuation
ValueCountFrequency (%)
. 133323
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 593200
59.7%
Latin 400138
40.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 55403
13.8%
A 48963
12.2%
E 33716
 
8.4%
S 30480
 
7.6%
R 25125
 
6.3%
G 24740
 
6.2%
N 22866
 
5.7%
C 20750
 
5.2%
B 18493
 
4.6%
M 17528
 
4.4%
Other values (16) 102074
25.5%
Common
ValueCountFrequency (%)
1 178102
30.0%
_ 133380
22.5%
. 133323
22.5%
2 44340
 
7.5%
4 18773
 
3.2%
3 15053
 
2.5%
9 12569
 
2.1%
6 12395
 
2.1%
5 12323
 
2.1%
8 12169
 
2.1%
Other values (2) 20773
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 993338
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 178102
17.9%
_ 133380
13.4%
. 133323
13.4%
U 55403
 
5.6%
A 48963
 
4.9%
2 44340
 
4.5%
E 33716
 
3.4%
S 30480
 
3.1%
R 25125
 
2.5%
G 24740
 
2.5%
Other values (28) 285766
28.8%
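
The sample rows for level1Gid (`DOM.26_1`, `USA.47_1`) suggest a fixed shape: three uppercase letters, one dotted number per administrative level, then an underscore and a number. The sketch below turns that shape into a check. The pattern is inferred from the sample rows only, not from a format specification, and `gid_pattern` and `invalid_gids` are hypothetical helper names.

```python
import re


def gid_pattern(level):
    """Regex for the GID shape seen in the samples at a given level (>= 1)."""
    # e.g. level 1 -> ABC.<n>_<n>, level 2 -> ABC.<n>.<n>_<n>
    return re.compile(r"[A-Z]{3}(?:\.\d+){" + str(level) + r"}_\d+")


def invalid_gids(values, level):
    """Return non-missing values that do not match the assumed shape."""
    pattern = gid_pattern(level)
    return [v for v in values if v is not None and not pattern.fullmatch(v)]


# Hypothetical mix of well-formed and malformed level-1 values.
sample = ["DOM.26_1", "USA.47_1", None, "usa.47_1", "USA.47.8_1"]
print(invalid_gids(sample, level=1))
```

Passing `level=2` or `level=3` applies the same check to the deeper GID columns.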

level1Name
Text

Missing 

Distinct: 1634
Distinct (%): 1.2%
Missing: 855020
Missing (%): 86.5%
Memory size: 7.5 MiB

Length

Max length: 32
Median length: 29
Mean length: 10.18645694
Min length: 3

Characters and Unicode

Total characters: 1358690
Distinct characters: 120
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 323
Unique (%): 0.2%

Sample

1st row: San Juan
2nd row: Bío-Bío
3rd row: Cuyuni-Mazaruni
4th row: Virginia
5th row: Acre
Value                     Count   Frequency (%)
amazonas                  5737    3.2%
upper                     4920    2.7%
maryland                  4566    2.5%
essequibo                 4171    2.3%
potaro-siparuni           4001    2.2%
virginia                  3991    2.2%
takutu-upper              3952    2.2%
columbia                  3813    2.1%
cuyuni-mazaruni           3604    2.0%
district                  3288    1.8%
Other values (1775)       138660  76.7%

Most occurring characters

ValueCountFrequency (%)
a 199066
14.7%
i 109009
 
8.0%
r 94704
 
7.0%
n 89929
 
6.6%
o 85891
 
6.3%
e 69436
 
5.1%
u 67791
 
5.0%
s 50157
 
3.7%
t 48416
 
3.6%
47321
 
3.5%
Other values (110) 496970
36.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1094781
80.6%
Uppercase Letter 194205
 
14.3%
Space Separator 47321
 
3.5%
Dash Punctuation 21841
 
1.6%
Other Punctuation 532
 
< 0.1%
Modifier Symbol 6
 
< 0.1%
Close Punctuation 2
 
< 0.1%
Open Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 199066
18.2%
i 109009
10.0%
r 94704
 
8.7%
n 89929
 
8.2%
o 85891
 
7.8%
e 69436
 
6.3%
u 67791
 
6.2%
s 50157
 
4.6%
t 48416
 
4.4%
l 43442
 
4.0%
Other values (68) 236940
21.6%
Uppercase Letter
ValueCountFrequency (%)
C 26555
13.7%
M 21196
10.9%
S 19522
 
10.1%
A 15466
 
8.0%
P 14898
 
7.7%
B 12281
 
6.3%
T 9818
 
5.1%
U 9558
 
4.9%
D 7694
 
4.0%
N 7614
 
3.9%
Other values (22) 49603
25.5%
Other Punctuation
ValueCountFrequency (%)
' 208
39.1%
! 144
27.1%
. 122
22.9%
, 57
 
10.7%
/ 1
 
0.2%
Space Separator
ValueCountFrequency (%)
47321
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 21841
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 6
100.0%
Close Punctuation
ValueCountFrequency (%)
] 2
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1288986
94.9%
Common 69704
 
5.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 199066
15.4%
i 109009
 
8.5%
r 94704
 
7.3%
n 89929
 
7.0%
o 85891
 
6.7%
e 69436
 
5.4%
u 67791
 
5.3%
s 50157
 
3.9%
t 48416
 
3.8%
l 43442
 
3.4%
Other values (100) 431145
33.4%
Common
ValueCountFrequency (%)
47321
67.9%
- 21841
31.3%
' 208
 
0.3%
! 144
 
0.2%
. 122
 
0.2%
, 57
 
0.1%
` 6
 
< 0.1%
] 2
 
< 0.1%
[ 2
 
< 0.1%
/ 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1339717
98.6%
None 18880
 
1.4%
Latin Ext Additional 93
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 199066
14.9%
i 109009
 
8.1%
r 94704
 
7.1%
n 89929
 
6.7%
o 85891
 
6.4%
e 69436
 
5.2%
u 67791
 
5.1%
s 50157
 
3.7%
t 48416
 
3.6%
47321
 
3.5%
Other values (52) 477997
35.7%
None
ValueCountFrequency (%)
í 6177
32.7%
á 4744
25.1%
é 3338
17.7%
ó 1150
 
6.1%
ã 984
 
5.2%
Î 684
 
3.6%
ô 454
 
2.4%
ñ 381
 
2.0%
Ñ 252
 
1.3%
ö 121
 
0.6%
Other values (37) 595
 
3.2%
Latin Ext Additional
ValueCountFrequency (%)
25
26.9%
22
23.7%
18
19.4%
9
 
9.7%
5
 
5.4%
4
 
4.3%
3
 
3.2%
3
 
3.2%
2
 
2.2%
ế 1
 
1.1%
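
The first frequency table for level1Name lists entries such as `upper`, `takutu-upper` and `district` rather than full region names. That suggests it counts lowercased, whitespace-separated tokens and not whole values. The sketch below contrasts the two counts. The tokenisation rule is an assumption inferred from the table, and both function names and the sample data are hypothetical.

```python
from collections import Counter


def value_counts(values):
    """Count whole, non-missing values exactly as stored."""
    return Counter(v for v in values if v is not None)


def token_counts(values):
    """Count lowercased tokens split on whitespace (hyphens are kept)."""
    counts = Counter()
    for value in values:
        if value is not None:
            counts.update(value.lower().split())
    return counts


# Hypothetical sample shaped like the region names in this column.
sample = [
    "Upper Takutu-Upper Essequibo",
    "Upper Demerara-Berbice",
    "District of Columbia",
    None,
    "Maryland",
]
print(value_counts(sample).most_common(2))
print(token_counts(sample).most_common(2))
```

Read this way, a token count can exceed the count of any single value, so the percentages in the token table are shares of tokens, not of records.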

level2Gid
Text

Missing 

Distinct: 7917
Distinct (%): 6.1%
Missing: 859029
Missing (%): 86.9%
Memory size: 7.5 MiB

Length

Max length: 12
Median length: 11
Mean length: 9.905212061
Min length: 8

Characters and Unicode

Total characters: 1281467
Distinct characters: 38
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2525
Unique (%): 2.0%

Sample

1st row: DOM.26.2_1
2nd row: CHL.6.3_1
3rd row: GUY.2.5_1
4th row: USA.47.8_1
5th row: BRA.1.11_2
Value                     Count   Frequency (%)
usa.9.1_1                 3286    2.5%
guy.8.8_1                 3032    2.3%
guy.2.8_1                 2312    1.8%
guy.10.4_1                2189    1.7%
usa.21.15_1               1956    1.5%
usa.21.16_1               1386    1.1%
ven.6.5_1                 1255    1.0%
ven.1.7_1                 1231    1.0%
usa.47.102_1              1092    0.8%
usa.2.17_1                1081    0.8%
Other values (7907)       110553  85.5%

Most occurring characters

ValueCountFrequency (%)
. 258687
20.2%
1 192666
15.0%
_ 129373
 
10.1%
2 95635
 
7.5%
U 55286
 
4.3%
A 48244
 
3.8%
4 37638
 
2.9%
E 33642
 
2.6%
3 32693
 
2.6%
S 29992
 
2.3%
Other values (28) 367611
28.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 505296
39.4%
Uppercase Letter 388111
30.3%
Other Punctuation 258687
20.2%
Connector Punctuation 129373
 
10.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 55286
14.2%
A 48244
12.4%
E 33642
 
8.7%
S 29992
 
7.7%
G 24307
 
6.3%
R 23607
 
6.1%
N 22823
 
5.9%
C 20618
 
5.3%
B 17749
 
4.6%
M 16726
 
4.3%
Other values (16) 95117
24.5%
Decimal Number
ValueCountFrequency (%)
1 192666
38.1%
2 95635
18.9%
4 37638
 
7.4%
3 32693
 
6.5%
5 29765
 
5.9%
8 28434
 
5.6%
6 25444
 
5.0%
7 21723
 
4.3%
9 21112
 
4.2%
0 20186
 
4.0%
Other Punctuation
ValueCountFrequency (%)
. 258687
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 129373
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 893356
69.7%
Latin 388111
30.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 55286
14.2%
A 48244
12.4%
E 33642
 
8.7%
S 29992
 
7.7%
G 24307
 
6.3%
R 23607
 
6.1%
N 22823
 
5.9%
C 20618
 
5.3%
B 17749
 
4.6%
M 16726
 
4.3%
Other values (16) 95117
24.5%
Common
ValueCountFrequency (%)
. 258687
29.0%
1 192666
21.6%
_ 129373
14.5%
2 95635
 
10.7%
4 37638
 
4.2%
3 32693
 
3.7%
5 29765
 
3.3%
8 28434
 
3.2%
6 25444
 
2.8%
7 21723
 
2.4%
Other values (2) 41298
 
4.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1281467
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 258687
20.2%
1 192666
15.0%
_ 129373
 
10.1%
2 95635
 
7.5%
U 55286
 
4.3%
A 48244
 
3.8%
4 37638
 
2.9%
E 33642
 
2.6%
3 32693
 
2.6%
S 29992
 
2.3%
Other values (28) 367611
28.7%

level2Name
Text

Missing 

Distinct: 7281
Distinct (%): 5.6%
Missing: 859040
Missing (%): 86.9%
Memory size: 7.5 MiB

Length

Max length: 32
Median length: 27
Mean length: 10.8988652
Min length: 1

Characters and Unicode

Total characters: 1409899
Distinct characters: 144
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2224
Unique (%): 1.7%

Sample

1st row: El Cercado
2nd row: Concepción
3rd row: Kamarang
4th row: Arlington
5th row: Manoel Urbano
ValueCountFrequency (%)
of 11716
 
5.2%
rest 8168
 
3.6%
region 8145
 
3.6%
3557
 
1.6%
de 3551
 
1.6%
district 3288
 
1.5%
columbia 3288
 
1.5%
8 3040
 
1.4%
san 2745
 
1.2%
prince 2492
 
1.1%
Other values (7516) 174499
77.7%

Most occurring characters

ValueCountFrequency (%)
a 163519
 
11.6%
o 112697
 
8.0%
e 96344
 
6.8%
95127
 
6.7%
i 93298
 
6.6%
n 91747
 
6.5%
r 79497
 
5.6%
t 58541
 
4.2%
u 49822
 
3.5%
l 48733
 
3.5%
Other values (134) 520574
36.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1091009
77.4%
Uppercase Letter 196524
 
13.9%
Space Separator 95127
 
6.7%
Decimal Number 10222
 
0.7%
Other Punctuation 6418
 
0.5%
Dash Punctuation 6284
 
0.4%
Open Punctuation 2160
 
0.2%
Close Punctuation 1175
 
0.1%
Math Symbol 980
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 163519
15.0%
o 112697
10.3%
e 96344
 
8.8%
i 93298
 
8.6%
n 91747
 
8.4%
r 79497
 
7.3%
t 58541
 
5.4%
u 49822
 
4.6%
l 48733
 
4.5%
s 44464
 
4.1%
Other values (70) 252347
23.1%
Uppercase Letter
ValueCountFrequency (%)
R 24531
12.5%
C 21043
 
10.7%
S 18194
 
9.3%
M 15534
 
7.9%
A 12868
 
6.5%
P 12479
 
6.3%
B 9745
 
5.0%
D 8437
 
4.3%
N 8307
 
4.2%
L 8136
 
4.1%
Other values (34) 57250
29.1%
Decimal Number
ValueCountFrequency (%)
8 3091
30.2%
7 2342
22.9%
9 2243
21.9%
1 1445
14.1%
0 689
 
6.7%
6 126
 
1.2%
2 102
 
1.0%
3 96
 
0.9%
5 67
 
0.7%
4 21
 
0.2%
Other Punctuation
ValueCountFrequency (%)
' 1961
30.6%
. 1603
25.0%
/ 1439
22.4%
, 1370
21.3%
? 45
 
0.7%
Space Separator
ValueCountFrequency (%)
95127
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6284
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2160
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1175
100.0%
Math Symbol
ValueCountFrequency (%)
+ 980
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1287533
91.3%
Common 122366
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 163519
 
12.7%
o 112697
 
8.8%
e 96344
 
7.5%
i 93298
 
7.2%
n 91747
 
7.1%
r 79497
 
6.2%
t 58541
 
4.5%
u 49822
 
3.9%
l 48733
 
3.8%
s 44464
 
3.5%
Other values (114) 448871
34.9%
Common
ValueCountFrequency (%)
95127
77.7%
- 6284
 
5.1%
8 3091
 
2.5%
7 2342
 
1.9%
9 2243
 
1.8%
( 2160
 
1.8%
' 1961
 
1.6%
. 1603
 
1.3%
1 1445
 
1.2%
/ 1439
 
1.2%
Other values (10) 4671
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1384947
98.2%
None 24804
 
1.8%
Latin Ext Additional 148
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 163519
 
11.8%
o 112697
 
8.1%
e 96344
 
7.0%
95127
 
6.9%
i 93298
 
6.7%
n 91747
 
6.6%
r 79497
 
5.7%
t 58541
 
4.2%
u 49822
 
3.6%
l 48733
 
3.5%
Other values (62) 495622
35.8%
None
ValueCountFrequency (%)
í 4840
19.5%
á 4536
18.3%
é 4424
17.8%
ó 3619
14.6%
ã 1773
 
7.1%
ñ 1314
 
5.3%
ê 791
 
3.2%
ü 763
 
3.1%
ú 647
 
2.6%
ç 549
 
2.2%
Other values (47) 1548
 
6.2%
Latin Ext Additional
ValueCountFrequency (%)
50
33.8%
37
25.0%
14
 
9.5%
11
 
7.4%
11
 
7.4%
10
 
6.8%
3
 
2.0%
2
 
1.4%
2
 
1.4%
2
 
1.4%
Other values (5) 6
 
4.1%

level3Gid
Text

Missing 

Distinct: 4058
Distinct (%): 11.6%
Missing: 953538
Missing (%): 96.5%
Memory size: 7.5 MiB

Length

Max length: 36
Median length: 15
Mean length: 11.74667279
Min length: 11

Characters and Unicode

Total characters: 409536
Distinct characters: 42
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1545
Unique (%): 4.4%

Sample

1st row: CHL.6.3.12_1
2nd row: PER.18.1.3_1
3rd row: CRI.4.5.4_1
4th row: ECU.21.2.1_1
5th row: PER.8.9.1_1
Value                     Count   Frequency (%)
per.8.9.1_1               481     1.4%
per.18.3.4_1              344     1.0%
ecu.14.14.2_1             335     1.0%
bol.4.17.4_2              316     0.9%
can.6.1.8_1               291     0.8%
ecu.17.4.1_1              285     0.8%
bol.8.14.1_2              276     0.8%
can.13.1.35_1             214     0.6%
bol.4.18.2_2              207     0.6%
per.20.2.4_1              189     0.5%
Other values (4048)       31926   91.6%

Most occurring characters

ValueCountFrequency (%)
. 104589
25.5%
1 65183
15.9%
_ 34863
 
8.5%
2 26503
 
6.5%
C 15795
 
3.9%
4 15291
 
3.7%
3 14173
 
3.5%
E 12203
 
3.0%
6 9734
 
2.4%
5 9308
 
2.3%
Other values (32) 101894
24.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 165485
40.4%
Other Punctuation 104589
25.5%
Uppercase Letter 104581
25.5%
Connector Punctuation 34863
 
8.5%
Lowercase Letter 14
 
< 0.1%
Dash Punctuation 4
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 15795
15.1%
E 12203
11.7%
N 8627
 
8.2%
A 8156
 
7.8%
R 7260
 
6.9%
U 7217
 
6.9%
P 6249
 
6.0%
L 6230
 
6.0%
H 5144
 
4.9%
B 4792
 
4.6%
Other values (14) 22908
21.9%
Decimal Number
ValueCountFrequency (%)
1 65183
39.4%
2 26503
16.0%
4 15291
 
9.2%
3 14173
 
8.6%
6 9734
 
5.9%
5 9308
 
5.6%
8 8312
 
5.0%
7 6139
 
3.7%
9 6133
 
3.7%
0 4709
 
2.8%
Lowercase Letter
ValueCountFrequency (%)
c 4
28.6%
a 4
28.6%
b 3
21.4%
d 2
14.3%
e 1
 
7.1%
Other Punctuation
ValueCountFrequency (%)
. 104589
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 34863
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 304941
74.5%
Latin 104595
 
25.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 15795
15.1%
E 12203
11.7%
N 8627
 
8.2%
A 8156
 
7.8%
R 7260
 
6.9%
U 7217
 
6.9%
P 6249
 
6.0%
L 6230
 
6.0%
H 5144
 
4.9%
B 4792
 
4.6%
Other values (19) 22922
21.9%
Common
ValueCountFrequency (%)
. 104589
34.3%
1 65183
21.4%
_ 34863
 
11.4%
2 26503
 
8.7%
4 15291
 
5.0%
3 14173
 
4.6%
6 9734
 
3.2%
5 9308
 
3.1%
8 8312
 
2.7%
7 6139
 
2.0%
Other values (3) 10846
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 409536
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 104589
25.5%
1 65183
15.9%
_ 34863
 
8.5%
2 26503
 
6.5%
C 15795
 
3.9%
4 15291
 
3.7%
3 14173
 
3.5%
E 12203
 
3.0%
6 9734
 
2.4%
5 9308
 
2.3%
Other values (32) 101894
24.9%
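
The GID samples in this report nest by level: `CHL.6.3.12_1` at level 3 extends `CHL.6.3_1` at level 2, and `BRA.1.11_2` at level 2 extends `BRA.1_1` at level 1 even though the trailing number differs. The sketch below checks that relationship for pairs of parent and child GIDs. The nesting rule is inferred from the samples only, and all helper names are hypothetical. A value of a different shape, such as whatever produces the maximum length of 36 in level3Gid, would be reported as a mismatch.

```python
def gid_stem(gid):
    """Strip the trailing '_<n>' suffix: 'BRA.1.11_2' -> 'BRA.1.11'."""
    return gid.rsplit("_", 1)[0]


def is_child_of(child_gid, parent_gid):
    """True if dropping the child's last dotted part gives the parent's stem."""
    child_stem = gid_stem(child_gid)
    if "." not in child_stem:
        return False
    return child_stem.rsplit(".", 1)[0] == gid_stem(parent_gid)


def hierarchy_mismatches(pairs):
    """Return (parent, child) pairs, both present, that do not nest."""
    return [
        (parent, child)
        for parent, child in pairs
        if parent is not None and child is not None
        and not is_child_of(child, parent)
    ]


# Hypothetical (level2Gid, level3Gid) pairs; the second one is inconsistent.
pairs = [
    ("CHL.6.3_1", "CHL.6.3.12_1"),
    ("CHL.6.3_1", "PER.18.1.3_1"),
    (None, "CRI.4.5.4_1"),
]
print(hierarchy_mismatches(pairs))
```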

level3Name
Text

Missing 

Distinct: 3831
Distinct (%): 11.1%
Missing: 953860
Missing (%): 96.5%
Memory size: 7.5 MiB

Length

Max length: 32
Median length: 28
Mean length: 10.60914249
Min length: 2

Characters and Unicode

Total characters: 366461
Distinct characters: 127
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1425
Unique (%): 4.1%

Sample

1st row: Tomé
2nd row: Manu
3rd row: San José
4th row: Alluriquin
5th row: Echarate
Value                     Count   Frequency (%)
san                       1730    3.1%
de                        1393    2.5%
unorganized               1082    1.9%
la                        844     1.5%
el                        708     1.3%
no                        616     1.1%
division                  487     0.9%
echarate                  481     0.9%
santa                     470     0.8%
en                        449     0.8%
Other values (4214)       48076   85.3%

Most occurring characters

ValueCountFrequency (%)
a 52267
 
14.3%
n 25746
 
7.0%
o 25629
 
7.0%
i 23308
 
6.4%
21794
 
5.9%
e 21436
 
5.8%
r 18083
 
4.9%
u 14350
 
3.9%
l 13877
 
3.8%
t 11765
 
3.2%
Other values (117) 138206
37.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 281473
76.8%
Uppercase Letter 54853
 
15.0%
Space Separator 21794
 
5.9%
Other Punctuation 3254
 
0.9%
Decimal Number 1728
 
0.5%
Open Punctuation 1395
 
0.4%
Close Punctuation 1091
 
0.3%
Dash Punctuation 865
 
0.2%
Final Punctuation 8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 52267
18.6%
n 25746
 
9.1%
o 25629
 
9.1%
i 23308
 
8.3%
e 21436
 
7.6%
r 18083
 
6.4%
u 14350
 
5.1%
l 13877
 
4.9%
t 11765
 
4.2%
s 8645
 
3.1%
Other values (63) 66367
23.6%
Uppercase Letter
ValueCountFrequency (%)
C 5987
 
10.9%
S 5956
 
10.9%
M 3594
 
6.6%
P 3581
 
6.5%
A 3449
 
6.3%
B 3173
 
5.8%
T 3172
 
5.8%
L 3070
 
5.6%
N 2650
 
4.8%
D 2344
 
4.3%
Other values (21) 17877
32.6%
Decimal Number
ValueCountFrequency (%)
1 735
42.5%
2 253
 
14.6%
4 166
 
9.6%
0 127
 
7.3%
3 119
 
6.9%
9 111
 
6.4%
8 65
 
3.8%
6 55
 
3.2%
5 50
 
2.9%
7 47
 
2.7%
Other Punctuation
ValueCountFrequency (%)
, 1547
47.5%
. 1501
46.1%
' 135
 
4.1%
/ 40
 
1.2%
! 15
 
0.5%
: 9
 
0.3%
" 6
 
0.2%
? 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
21794
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1395
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1091
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 865
100.0%
Final Punctuation
ValueCountFrequency (%)
8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 336326
91.8%
Common 30135
 
8.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 52267
15.5%
n 25746
 
7.7%
o 25629
 
7.6%
i 23308
 
6.9%
e 21436
 
6.4%
r 18083
 
5.4%
u 14350
 
4.3%
l 13877
 
4.1%
t 11765
 
3.5%
s 8645
 
2.6%
Other values (94) 121220
36.0%
Common
ValueCountFrequency (%)
21794
72.3%
, 1547
 
5.1%
. 1501
 
5.0%
( 1395
 
4.6%
) 1091
 
3.6%
- 865
 
2.9%
1 735
 
2.4%
2 253
 
0.8%
4 166
 
0.6%
' 135
 
0.4%
Other values (13) 653
 
2.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 363392
99.2%
None 2910
 
0.8%
Latin Ext Additional 151
 
< 0.1%
Punctuation 8
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 52267
 
14.4%
n 25746
 
7.1%
o 25629
 
7.1%
i 23308
 
6.4%
21794
 
6.0%
e 21436
 
5.9%
r 18083
 
5.0%
u 14350
 
3.9%
l 13877
 
3.8%
t 11765
 
3.2%
Other values (64) 135137
37.2%
None
ValueCountFrequency (%)
ñ 638
21.9%
é 530
18.2%
ó 426
14.6%
í 303
10.4%
á 270
9.3%
ê 263
9.0%
ü 141
 
4.8%
ú 60
 
2.1%
è 58
 
2.0%
ơ 35
 
1.2%
Other values (23) 186
 
6.4%
Latin Ext Additional
ValueCountFrequency (%)
28
18.5%
ế 28
18.5%
17
11.3%
14
9.3%
12
7.9%
10
 
6.6%
9
 
6.0%
8
 
5.3%
6
 
4.0%
4
 
2.6%
Other values (9) 15
9.9%
Punctuation
ValueCountFrequency (%)
8
100.0%

iucnRedListCategory
Text

Missing 

Distinct: 10
Distinct (%): < 0.1%
Missing: 91545
Missing (%): 9.3%
Memory size: 7.5 MiB

Length

Max length: 24
Median length: 2
Mean length: 2.00002453
Min length: 2

Characters and Unicode

Total characters: 1793736
Distinct characters: 24
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: NE
2nd row: NE
3rd row: NE
4th row: NE
5th row: NE
Value                     Count   Frequency (%)
ne                        712718  79.5%
lc                        165443  18.4%
vu                        6108    0.7%
en                        4438    0.5%
nt                        3884    0.4%
dd                        2382    0.3%
cr                        1766    0.2%
ew                        91      < 0.1%
ex                        26      < 0.1%
2024-12-02t13:56:28.527z  1       < 0.1%

Most occurring characters

ValueCountFrequency (%)
N 721040
40.2%
E 717273
40.0%
C 167209
 
9.3%
L 165443
 
9.2%
V 6108
 
0.3%
U 6108
 
0.3%
D 4764
 
0.3%
T 3885
 
0.2%
R 1766
 
0.1%
W 91
 
< 0.1%
Other values (14) 49
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1793714
> 99.9%
Decimal Number 17
 
< 0.1%
Other Punctuation 3
 
< 0.1%
Dash Punctuation 2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 721040
40.2%
E 717273
40.0%
C 167209
 
9.3%
L 165443
 
9.2%
V 6108
 
0.3%
U 6108
 
0.3%
D 4764
 
0.3%
T 3885
 
0.2%
R 1766
 
0.1%
W 91
 
< 0.1%
Other values (2) 27
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 6
35.3%
1 2
 
11.8%
5 2
 
11.8%
0 2
 
11.8%
3 1
 
5.9%
4 1
 
5.9%
6 1
 
5.9%
8 1
 
5.9%
7 1
 
5.9%
Other Punctuation
ValueCountFrequency (%)
: 2
66.7%
. 1
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1793714
> 99.9%
Common 22
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 721040
40.2%
E 717273
40.0%
C 167209
 
9.3%
L 165443
 
9.2%
V 6108
 
0.3%
U 6108
 
0.3%
D 4764
 
0.3%
T 3885
 
0.2%
R 1766
 
0.1%
W 91
 
< 0.1%
Other values (2) 27
 
< 0.1%
Common
ValueCountFrequency (%)
2 6
27.3%
1 2
 
9.1%
5 2
 
9.1%
: 2
 
9.1%
0 2
 
9.1%
- 2
 
9.1%
3 1
 
4.5%
4 1
 
4.5%
6 1
 
4.5%
8 1
 
4.5%
Other values (2) 2
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1793736
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 721040
40.2%
E 717273
40.0%
C 167209
 
9.3%
L 165443
 
9.2%
V 6108
 
0.3%
U 6108
 
0.3%
D 4764
 
0.3%
T 3885
 
0.2%
R 1766
 
0.1%
W 91
 
< 0.1%
Other values (14) 49
 
< 0.1%
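
Nine of the ten distinct iucnRedListCategory values are two-letter IUCN Red List codes. The tenth, `2024-12-02T13:56:28.527Z`, is a timestamp that accounts for the maximum length of 24 and for every digit and punctuation mark counted above. The sketch below flags any value outside the expected code set. The `IUCN_CODES` set and the `unexpected_categories` helper are hypothetical names, and the sample list is illustrative.

```python
from collections import Counter

# The nine IUCN Red List category codes that appear in the table above.
IUCN_CODES = {"NE", "DD", "LC", "NT", "VU", "EN", "CR", "EW", "EX"}


def unexpected_categories(values):
    """Count non-missing values that are not a recognised category code."""
    counts = Counter(
        v for v in values if v is not None and v not in IUCN_CODES
    )
    return dict(counts)


# Hypothetical sample including the stray timestamp seen in the report.
sample = ["NE", "LC", "2024-12-02T13:56:28.527Z", None, "NE", "VU"]
print(unexpected_categories(sample))
```

A stray value like this often points to a row whose fields shifted during parsing, so the neighbouring columns of any flagged record are worth inspecting as well.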